PROSTATE-SPECIFIC GENE, PCGEM1, AND METHODS OF USING PCGEM1
TO DETECT, TREAT, AND PREVENT PROSTATE CANCER 

CROSS REFERENCE TO RELATED APPLICATIONS 

The present application claims the benefit of United States provisional 
application S.N. 60/126,469, filed March 26, 1999, the entire disclosure of which is 
relied upon and incorporated by reference. 

FIELD OF THE INVENTION 

The present invention relates to nucleic acids that are expressed in prostate 
tissue. More particularly, the present invention relates to the first of a family of novel, 
androgen-regulated, prostate-specific genes, PCGEM1, that is over-expressed in 
prostate cancer, and methods of using the PCGEM1 sequence and fragments thereof 
to measure the hormone responsiveness of prostate cancer cells and to detect, 
diagnose, prevent and treat prostate cancer and other prostate related diseases. 

BACKGROUND 

Prostate cancer is the most common solid tumor in American men (1). The 
wide spectrum of biologic behavior (2) exhibited by prostatic neoplasms poses a 
difficult problem in predicting the clinical course for the individual patient (3, 4). 
Public awareness of prostate specific antigen (PSA) screening efforts has led to an 
increased diagnosis of prostate cancer. The increased diagnosis and greater number of 
patients presenting with prostate cancer has resulted in wider use of radical 
prostatectomy for localized disease (5). Accompanying the rise in surgical 
intervention is the frustrating realization of the inability to predict organ-confined 
disease and clinical outcome for a given patient (5, 6). Traditional prognostic 
markers, such as grade, clinical stage, and pretreatment PSA have limited prognostic 
value for individual men. There is clearly a need to recognize and develop molecular 
and genetic biomarkers to improve prognostication and the management of patients 
with clinically localized prostate cancer. As with other common human neoplasia (7), 
the search for molecular and genetic biomarkers to better define the genesis and 
progression of prostate cancer is the key focus for cancer research investigations 
worldwide. 

The new wave of research addressing molecular genetic alterations in prostate 
cancer is primarily due to increased awareness of this disease and the development of
newer molecular technologies. The search for the precursor of prostatic
adenocarcinoma has focused largely on the spectrum of microscopic changes referred 
to as "prostatic intraepithelial neoplasia" (PIN). Bostwick defines this spectrum as a 
histopathologic continuum that culminates in high grade PIN and early invasive 
cancer (8). The morphologic and molecular changes include the progressive 
disruption of the basal cell-layer, changes in the expression of differentiation markers 
of the prostatic secretory epithelial cells, nuclear and nucleolar abnormalities, 
increased cell proliferation, DNA content alterations, and chromosomal and allelic 
losses (8, 9). These molecular and genetic biomarkers, particularly their progressive 
gain or loss, can be followed to trace the etiology of prostate carcinogenesis. 
Foremost among these biomarkers would be the molecular and genetic markers 
associated with histological phenotypes in transition between normal prostatic 
epithelium and cancer. Most studies to date appear to agree that PIN and prostatic
adenocarcinoma cells share many features. Invasive carcinoma
more often reflects a magnification of some of the events already manifest in PIN.

Early detection of prostate cancer is possible today because of the widely 
propagated and recommended blood PSA test that provides a warning signal for 
prostate cancer if high levels of serum PSA are detected. However, when used alone, 
PSA is not sufficiently sensitive or specific to be considered an ideal tool for the early 
detection or staging of prostate cancer (10). Combining PSA levels with clinical 
staging and Gleason scores is more predictive of the pathological stage of localized 
prostate cancer (11). In addition, new molecular techniques are being used for
improved molecular staging of prostate cancer (12, 13). For instance, reverse
transcriptase-polymerase chain reaction (RT-PCR) can measure PSA of circulating
prostate cells in blood and bone marrow of prostate cancer patients. 

Despite new molecular techniques, however, as many as 25 percent of men 
with prostate cancer will have normal PSA levels - usually defined as those equal to 
or below 4 nanograms per milliliter of blood (14). In addition, more than 50 percent 
of the men with higher PSA levels are actually cancer free (14). Thus, PSA is not an 
ideal screening tool for prostate cancer. More reliable tumor-specific biomarkers are
needed that can distinguish between normal and hyperplastic epithelium, and the
preneoplastic and neoplastic stages of prostate cancer.

WO 00/58470 PCT/US00/07906

Identification and characterization of genetic alterations defining prostate 
cancer onset and progression is important in understanding the biology and clinical 
course of the disease. The currently available TNM staging system assigns the 
original primary tumor (T) to one of four stages (14). The first stage, T1, indicates
that the tumor is microscopic and cannot be felt on rectal examination. T2 refers to 
tumors that are palpable but fully contained within the prostate gland. A T3 
designation indicates the cancer has spread beyond the prostate into surrounding 
connective tissue or has invaded the neighboring seminal vesicles. T4 cancer has 
spread even further. The TNM staging system also assesses whether the cancer has 
metastasized to the pelvic lymph nodes (N) or beyond (M). Metastatic tumors result 
when cancer cells break away from the original tumor, circulate through the blood or 
lymph, and proliferate at distant sites in the body. 

Recent studies of metastatic prostate cancer have shown a significant 
heterogeneity of allelic losses of different chromosome regions between multiple 
cancer foci (21-23). These studies have also documented that the metastatic lesion 
can arise from cancer foci other than dominant tumors (22). It is therefore critical to
understand the molecular changes that define prostate cancer metastasis,
especially as prostate cancer is increasingly detected at early stages (15-21).

Moreover, the multifocal nature of prostate cancer needs to be considered (22-23)
when analyzing biomarkers that may have potential to predict tumor progression
or metastasis. Approximately 50-60% of patients treated with radical prostatectomy 
for localized prostate carcinomas are found to have microscopic disease that is not 
organ confined, and a significant portion of these patients relapse (24). Utilizing 
biostatistical modeling of traditional and genetic biomarkers such as p53 and bcl-2, 
Bauer et al. (25-26) were able to identify patients at risk of cancer recurrence after 
surgery. Thus, there is clearly a need to develop biomarkers defining the various stages
of prostate cancer progression.

Another significant aspect of prostate cancer is the key role that androgens 
play in the development of both the normal prostate and prostate cancer. Androgen
ablation, also referred to as "hormonal therapy," is a common treatment for prostate
cancer, particularly in patients with metastatic disease (14). Hormonal therapy aims
to prevent the body from making androgens or to block the activity of androgen. One
way to block androgen activity involves blocking the androgen receptor; however, 
that blockage is often only successful initially. For example, 70-80% of patients with 
advanced disease exhibit an initial subjective response to hormonal therapy, but most 
tumors progress to an androgen-independent state within two years (16). One 
mechanism proposed for the progression to an androgen-independent state involves 
constitutive activation of the androgen signaling pathway, which could arise from 
structural changes in the androgen receptor protein (16). 

As indicated above, the genesis and progression of cancer cells involve 
multiple genetic alterations as well as a complex interaction of several gene products. 
Thus, various strategies are required to fully understand the molecular genetic 
alterations in a specific type of cancer. In the past, most molecular biology studies 
had focused on mutations of cellular proto-oncogenes and tumor suppressor genes 
(TSGs) associated with prostate cancer (7). Recently, however, there has been an 
increasing shift toward the analysis of "expression genetics" in human cancer (27-31),
i.e., the under-expression or over-expression of cancer-specific genes. This shift 
addresses limitations of the previous approaches including: 1) labor intensive 
technology involved in identifying mutated genes that are associated with human 
cancer; 2) the limitations of experimental models with a bias toward identification of 
only certain classes of genes, e.g., identification of mutant ras genes by transfection of 
human tumor DNAs utilizing NIH3T3 cells; and 3) the recognition that the human 
cancer associated genes identified so far do not account for the diversity of cancer 
phenotypes. 

A number of studies are now addressing alterations of prostate cancer-associated
gene expression in patient specimens (32-36). More reports along these lines are
certain to follow.

Thus, despite the growing body of knowledge regarding prostate cancer, there 
is still a need in the art to uncover the identity and function of the genes involved in 
prostate cancer pathogenesis. There is also a need for reagents and assays to
accurately detect cancerous cells, to define various stages of prostate cancer
progression, to identify and characterize genetic alterations defining prostate cancer 
onset and progression, to detect micro-metastasis of prostate cancer, and to treat and 
prevent prostate cancer. 

SUMMARY OF THE INVENTION 

The present invention relates to the identification and characterization of a 
novel gene, the first of a family of genes, designated PCGEM1, for Prostate Cancer
Gene Expression Marker 1. PCGEM1 is specific to prostate tissue, is androgen-regulated,
and appears to be over-expressed in prostate cancer. More recent studies
associate PCGEM1 cDNA with promoting cell growth. The invention provides the
isolated nucleotide sequence of PCGEM1 or fragments thereof and nucleic acid
sequences that hybridize to PCGEM1. These sequences have utility, for example, as
markers of prostate cancer and other prostate related diseases, and as targets for 
therapeutic intervention in prostate cancer and other prostate related diseases. The 
invention further provides a vector that directs the expression of PCGEM1, and a host 
cell transfected or transduced with this vector. 

In another embodiment, the invention provides a method of detecting prostate 
cancer cells in a biological sample, for example, by using nucleic acid amplification 
techniques with primers and probes selected to bind specifically to the PCGEM1 
sequence. The invention further comprises a method of selectively killing a prostate 
cancer cell, a method of identifying an androgen responsive cell line, and a method of 
measuring responsiveness of a cell line to hormone-ablation therapy. 

In another aspect, the invention relates to an isolated polypeptide encoded by 
the PCGEM1 gene or a fragment thereof, and antibodies generated against the 
PCGEM1 polypeptide, peptides, or portions thereof, which can be used to detect, 
treat, and prevent prostate cancer. 

Additional features and advantages of the invention will be set forth in the 
description which follows, and in part will be apparent from the description, or may 
be learned by practice of the invention. The objectives and other advantages of the 
invention will be realized and attained by the sequences, cells, vectors, and methods
particularly pointed out in the written description and claims herein as well as the
appended drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the scheme for the identification of differentially expressed 
genes in prostate tumor and normal tissues. 

Figure 2 depicts a differential display pattern of mRNA obtained from 
matched tumor and normal tissues of a prostate cancer patient. Arrows indicate 
differentially expressed cDNAs. 

Figure 3 depicts the analysis of PCGEM1 expression in primary prostate 
cancers. 

Figure 4 depicts the expression pattern of PCGEM1 in prostate cancer cell
lines.

Figure 5a depicts the androgen regulation of PCGEM1 expression in LNCaP 
cells, as measured by reverse transcriptase PCR. 

Figure 5b depicts the androgen regulation of PCGEM1 expression in LNCaP 
cells, as measured by Northern blot hybridization. 

Figure 6a depicts the prostate tissue specific expression pattern of PCGEM1. 

Figure 6b depicts an RNA master blot showing the prostate tissue specificity of
PCGEM1. 

Figure 7A depicts the chromosomal localization of PCGEM1 by fluorescent in 
situ hybridization analysis. 

Figure 7B depicts a DAPI counter-stained chromosome 2 (left), an inverted
DAPI stained chromosome 2 shown as G-bands (center), and an ideogram of
chromosome 2 showing the localization of the signal to band 2q32 (bar).

Figure 8 depicts a cDNA sequence of PCGEM1 (SEQ ID NO:1).

Figure 9 depicts an additional cDNA sequence of PCGEM1 (SEQ ID NO:2). 

Figure 10 depicts the colony formation of NIH3T3 cell lines expressing 
various PCGEM1 constructs. 

Figure 11 depicts the cDNA sequence of the promoter region of PCGEM1,
SEQ ID NO:3.

Figure 12 depicts the cDNA of a probe, designated SEQ ID NO:4.

Figure 13 depicts the cDNAs of primers 1-3, designated SEQ ID NOs:5-7,
respectively.

Figure 14 depicts the genomic DNA sequence of PCGEM1, designated SEQ
ID NO:8.

Figure 15 depicts the structure of the PCGEM1 transcription unit.

Figure 16 depicts a graph of the hypothetical coding capacity of PCGEM1.

Figure 17 depicts a representative example of in situ hybridization results
showing PCGEM1 expression in normal and tumor areas of prostate cancer tissues.



DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to PCGEM1, the first of a family of genes, and
its related nucleic acids, proteins, antigens, and antibodies for use in the detection, 
prevention, and treatment of prostate cancer (e.g., prostatic intraepithelial neoplasia 
(PIN), adenocarcinomas, nodular hyperplasia, and large duct carcinomas) and prostate 
related diseases (e.g., benign prostatic hyperplasia), and kits comprising these 
reagents. 

Although we do not wish to be limited by any theory or hypothesis, 
preliminary data suggest that the PCGEM1 nucleotide sequence may be related to a 
family of non-coding poly A+ RNA that may be implicated in processes relating to
growth and embryonic development (40-44). Evidence presented herein supports this 
hypothesis. Alternatively, PCGEM1 cDNA may encode a small peptide. 

NUCLEIC ACID MOLECULES 

In a particular embodiment, the invention relates to certain isolated nucleotide 
sequences that are substantially free from contaminating endogenous material. A 
"nucleotide sequence" refers to a polynucleotide molecule in the form of a separate 
fragment or as a component of a larger nucleic acid construct. The nucleic acid 
molecule has been derived from DNA or RNA isolated at least once in substantially 
pure form and in a quantity or concentration enabling identification, manipulation, 
and recovery of its component nucleotide sequences by standard biochemical methods
(such as those outlined in Sambrook et al., Molecular Cloning: A Laboratory Manual,
2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989)). 

Nucleic acid molecules of the invention include DNA in both single-stranded 
and double-stranded form, as well as the RNA complement thereof. DNA includes, 
for example, cDNA, genomic DNA, chemically synthesized DNA, DNA amplified by 
PCR, and combinations thereof. Genomic DNA may be isolated by conventional 
techniques, e.g., using the cDNA of SEQ ID NO:1, SEQ ID NO:2, or suitable
fragments thereof, as a probe. 

The DNA molecules of the invention include full length genes as well as 
polynucleotides and fragments thereof. The full length gene may include the N- 
terminal signal peptide. Although a non-coding role of PCGEM1 appears likely, the 
possibility of a protein product cannot presently be ruled out. Therefore, other 
embodiments may include DNA encoding a soluble form, e.g., encoding the 
extracellular domain of the protein, either with or without the signal peptide. 

The nucleic acids of the invention are preferentially derived from human 
sources, but the invention includes those derived from non-human species, as well. 

Preferred Sequences 

Particularly preferred nucleotide sequences of the invention are SEQ ID NO:1,
SEQ ID NO:2, and SEQ ID NO:8, as set forth in Figures 8, 9, and 14, respectively.
Two cDNA clones having the nucleotide sequences of SEQ ID NO:1 and SEQ ID
NO:2, and the genomic DNA having the nucleotide sequence of SEQ ID NO:8, were
isolated as described in Example 2.

Thus, in a particular embodiment, this invention provides an isolated nucleic
acid molecule selected from the group consisting of (a) the polynucleotide sequence
of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (b) an isolated nucleic acid
molecule that hybridizes to either strand of a denatured, double-stranded DNA
comprising the nucleic acid sequence of (a) under conditions of moderate stringency
in 50% formamide and about 6X SSC at about 42°C with washing conditions of
approximately 60°C, about 0.5X SSC, and about 0.1% SDS; (c) an isolated nucleic
acid molecule that hybridizes to either strand of a denatured, double-stranded DNA
comprising the nucleic acid sequence of (a) under conditions of high stringency in
50% formamide and about 6X SSC, with washing conditions of approximately 68°C,
about 0.2X SSC, and about 0.1% SDS; (d) an isolated nucleic acid molecule derived
by in vitro mutagenesis from SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; (e) an
isolated nucleic acid molecule degenerate from SEQ ID NO:1, SEQ ID NO:2, or SEQ
ID NO:8 as a result of the genetic code; and (f) an isolated nucleic acid molecule
selected from the group consisting of human PCGEM1 DNA, an allelic variant of
human PCGEM1 DNA, and a species homolog of PCGEM1 DNA.

As used herein, conditions of moderate stringency can be readily determined 
by those having ordinary skill in the art based on, for example, the length of the DNA. 
The basic conditions are set forth by Sambrook et al. Molecular Cloning: A 
Laboratory Manual, 2d ed. Vol. 1, pp. 1.101-104, Cold Spring Harbor Laboratory 
Press, (1989), and include use of a prewashing solution for the nitrocellulose filters of 
about 5X SSC, about 0.5% SDS, and about 1.0 mM EDTA (pH 8.0), hybridization
conditions of about 50% formamide, about 6X SSC at about 42°C (or other similar 
hybridization solution, such as Stark's solution, in about 50% formamide at about 
42°C), and washing conditions of about 60°C, about 0.5X SSC, and about 0.1% SDS. 
Conditions of high stringency can also be readily determined by the skilled artisan 
based on, for example, the length of the DNA. Generally, such conditions are defined 
as hybridization conditions as above, and with washing at approximately 68°C, about 
0.2X SSC, and about 0.1% SDS. The skilled artisan will recognize that the 
temperature and wash solution salt concentration can be adjusted as necessary 
according to factors such as the length of the probe. 
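
By way of illustration only, the interplay of salt concentration, formamide, and probe length noted above can be sketched with a commonly cited empirical approximation for the melting temperature (Tm) of long DNA duplexes. The formula is not part of this disclosure. The sodium concentration assumed for 1X SSC (about 0.195 M), the 40% GC content, and the 1000 bp probe length are assumptions chosen for the example, and the effect of SDS is ignored.

```python
import math

# Assumption: 1X SSC = 0.15 M NaCl + 0.015 M trisodium citrate, i.e. ~0.195 M Na+.
NA_MOLAR_PER_1X_SSC = 0.195


def estimate_tm(ssc_x, percent_gc, percent_formamide, length_bp):
    """Approximate Tm (deg C) of a long DNA:DNA duplex.

    Empirical approximation (not taken from this disclosure):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    sodium = ssc_x * NA_MOLAR_PER_1X_SSC
    return (81.5 + 16.6 * math.log10(sodium) + 0.41 * percent_gc
            - 0.61 * percent_formamide - 500.0 / length_bp)


# Hypothetical probe: 1000 bp, 40% GC.
hyb_tm = estimate_tm(6.0, 40.0, 50.0, 1000)      # hybridization: 6X SSC, 50% formamide
moderate_tm = estimate_tm(0.5, 40.0, 0.0, 1000)  # moderate-stringency wash: 0.5X SSC
high_tm = estimate_tm(0.2, 40.0, 0.0, 1000)      # high-stringency wash: 0.2X SSC

for label, tm, temp in [("hybridization at 42 C", hyb_tm, 42.0),
                        ("moderate wash at 60 C", moderate_tm, 60.0),
                        ("high wash at 68 C", high_tm, 68.0)]:
    # A smaller margin below Tm means a more stringent condition.
    print(f"{label}: estimated Tm {tm:.1f} C, margin {tm - temp:.1f} C")
```

Under these assumptions the high stringency wash sits closer to the estimated Tm than the moderate stringency wash, which is the sense in which it is more stringent.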

Additional Sequences 

Due to the known degeneracy of the genetic code, wherein more than one
codon can encode the same amino acid, a DNA sequence can vary from that shown in
SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8, and still encode PCGEM1. Such
variant DNA sequences can result from silent mutations (e.g., occurring during PCR
amplification), or can be the product of deliberate mutagenesis of a native sequence.

The invention thus provides isolated DNA sequences of the invention selected
from: (a) DNA comprising the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:2,
or SEQ ID NO:8; (b) DNA capable of hybridization to a DNA of (a) under conditions 
of moderate stringency; (c) DNA capable of hybridization to a DNA of (a) under 
conditions of high stringency; and (d) DNA which is degenerate as a result of the 
genetic code to a DNA defined in (a), (b), or (c). Such sequences are preferably 
provided and/or constructed in the form of an open reading frame uninterrupted by 
internal non-translated sequences, or introns, that are typically present in eukaryotic 
genes. Sequences of non-translated DNA can be present 5' or 3' from an open 
reading frame, where the same do not interfere with manipulation or expression of the 
coding region. Of course, should PCGEM1 encode a polypeptide, polypeptides 
encoded by such DNA sequences are encompassed by the invention. Conditions of 
moderate and high stringency are described above. 

In another embodiment, the nucleic acid molecules of the invention comprise 
nucleotide sequences that are at least 80% identical to a nucleotide sequence set forth 
herein. Also contemplated are embodiments in which a nucleic acid molecule 
comprises a sequence that is at least 90% identical, at least 95% identical, at least 98% 
identical, at least 99% identical, or at least 99.9% identical to a nucleotide sequence 
set forth herein. 
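
The identity thresholds recited above can be checked by simple calculation once two sequences have been aligned. The following is a minimal sketch, assuming the alignment is already available as two equal-length strings with "-" marking gaps, and assuming identity is counted over ungapped columns only (other conventions for the denominator exist). The variant sequence is hypothetical; the reference is primer p413 of Table I.

```python
THRESHOLDS = (80.0, 90.0, 95.0, 98.0, 99.0, 99.9)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gapped columns are not counted in this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def thresholds_met(identity):
    """Return the recited identity thresholds that a given percent identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


reference = "TGGCAACAGGCAAGCAGAG"  # primer p413 (Table I)
variant = "TGGCAACAGGTAAGCAGAG"    # hypothetical variant with one substitution
identity = percent_identity(reference, variant)
print(f"{identity:.2f}% identical; thresholds met: {thresholds_met(identity)}")
```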

Percent identity may be determined by visual inspection and mathematical 
calculation. Alternatively, percent identity of two nucleic acid sequences may be 
determined by comparing sequence information using the GAP computer program, 
version 6.0 described by Devereux et al. (Nucl. Acids Res. 12:387, 1984) and available
from the University of Wisconsin Genetics Computer Group (UWGCG). The
preferred default parameters for the GAP program include: (1) a unary comparison 
matrix (containing a value of 1 for identities and 0 for non-identities) for nucleotides, 
and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 
14:6745, 1986, as described by Schwartz and Dayhoff, eds., Atlas of Protein Sequence 
and Structure, National Biomedical Research Foundation, pp. 353-358, 1979; (2) a 
penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each
gap; and (3) no penalty for end gaps. Other programs used by one skilled in the art of
sequence comparison may also be used. 
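
To make the effect of the default parameters above concrete, the sketch below scores a given pairwise nucleotide alignment with a unary comparison matrix, a penalty of 3.0 per gap plus 0.10 per gap symbol, and no penalty for end gaps. It is not the GAP program and does not search for the optimal alignment; it only scores an alignment supplied as two equal-length strings, which is an assumption of this example.

```python
import re

GAP_PENALTY = 3.0          # penalty for each gap
GAP_SYMBOL_PENALTY = 0.10  # additional penalty for each symbol in each gap


def score_alignment(aligned_a, aligned_b):
    """Score a pre-computed alignment; '-' marks a gap symbol."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Unary comparison matrix: 1 for identities, 0 for non-identities.
    score = sum(1.0 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    for seq in (aligned_a, aligned_b):
        for gap in re.finditer(r"-+", seq):
            if gap.start() == 0 or gap.end() == len(seq):
                continue  # end gaps are not penalized
            score -= GAP_PENALTY + GAP_SYMBOL_PENALTY * len(gap.group())
    return score


internal = score_alignment("ACGTACGT", "ACG--CGT")  # one internal gap of two symbols
terminal = score_alignment("ACGT", "--GT")          # one end gap, unpenalized
print(internal, terminal)
```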

The invention also provides isolated nucleic acids useful in the production of 
polypeptides. Such polypeptides may be prepared by any of a number of conventional 
techniques. A DNA sequence of this invention or desired fragment thereof may be 
subcloned into an expression vector for production of the polypeptide or fragment. 
The DNA sequence advantageously is fused to a sequence encoding a suitable leader 
or signal peptide. Alternatively, the desired fragment may be chemically synthesized 
using known techniques. DNA fragments also may be produced by restriction 
endonuclease digestion of a full length cloned DNA sequence, and isolated by 
electrophoresis on agarose gels. If necessary, oligonucleotides that reconstruct the 5' 
or 3' terminus to a desired point may be ligated to a DNA fragment generated by 
restriction enzyme digestion. Such oligonucleotides may additionally contain a 
restriction endonuclease cleavage site upstream of the desired coding sequence, and 
position an initiation codon (ATG) at the N-terminus of the coding sequence. 

The well-known polymerase chain reaction (PCR) procedure also may be 
employed to isolate and amplify a DNA sequence encoding a desired protein 
fragment. Oligonucleotides that define the desired termini of the DNA fragment are 
employed as 5' and 3' primers. The oligonucleotides may additionally contain 
recognition sites for restriction endonucleases, to facilitate insertion of the amplified 
DNA fragment into an expression vector. PCR techniques are described in Saiki et
al., Science 239:487 (1988); Recombinant DNA Methodology, Wu et al., eds.,
Academic Press, Inc., San Diego (1989), pp. 189-196; and PCR Protocols: A Guide
to Methods and Applications, Innis et al., eds., Academic Press, Inc. (1990).
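
As a sketch of the primer design just described, the following derives 5' and 3' primers that define the termini of a desired fragment and optionally prepends a restriction endonuclease recognition site to each. The template is an arbitrary made-up sequence, not a PCGEM1 sequence, and the primer length of 18 and the choice of EcoRI and BamHI sites are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

ECORI = "GAATTC"  # EcoRI recognition site
BAMHI = "GGATCC"  # BamHI recognition site


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(template, primer_len=18, site_5="", site_3=""):
    """Return (forward, reverse) primers, 5'->3', defining the termini of `template`."""
    if len(template) < primer_len:
        raise ValueError("template is shorter than the requested primer length")
    forward = site_5 + template[:primer_len]
    reverse = site_3 + reverse_complement(template[-primer_len:])
    return forward, reverse


# Arbitrary 50-nt template used only for illustration.
TEMPLATE = "ATGGCTAGCTTGACCGATCGTACGGATTACGCTAAGCTTGGCATCCGTAA"
forward, reverse = design_primers(TEMPLATE, site_5=ECORI, site_3=BAMHI)
print("5' primer:", forward)
print("3' primer:", reverse)
```

The 3' primer is the reverse complement of the fragment end, so that both primers read 5' to 3' toward the interior of the fragment.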

USE OF PCGEM1 NUCLEIC ACID OR OLIGONUCLEOTIDES 

In a particular embodiment, the invention relates to PCGEM1 nucleotide 
sequences isolated from human prostate cells, including the complete genomic DNA 
(Figure 14, SEQ ID NO:8), and two full length cDNAs: SEQ ID NO:1 (Figure 8) and
SEQ ID NO:2 (Figure 9), and fragments thereof. The nucleic acids of the invention,
including DNA, RNA, mRNA and oligonucleotides thereof, are useful in a variety of
applications in the detection, diagnosis, prognosis, and treatment of prostate cancer.
Examples of applications within the scope of the present invention include, but are not 
limited to: 

amplifying PCGEM1 sequences;

detecting a PCGEM1-derived marker of prostate cancer by
hybridization with an oligonucleotide probe;

identifying chromosome 2;

mapping genes to chromosome 2;

identifying genes associated with certain diseases, syndromes, or other
conditions associated with human chromosome 2;

constructing vectors having PCGEM1 sequences;

expressing vector-associated PCGEM1 sequences as RNA and protein;

detecting defective genes in an individual;

developing gene therapy;

developing immunologic reagents corresponding to PCGEM1-encoded
products; and

treating prostate cancer using antibodies, antisense nucleic acids, or
other inhibitors specific for PCGEM1 sequences.

Detecting, Diagnosing, and Treating Prostate Cancer

The present invention provides a method of detecting prostate cancer in a
patient, which comprises (a) detecting PCGEM1 mRNA in a biological sample from
the patient; and (b) correlating the amount of PCGEM1 mRNA in the sample with the
presence of prostate cancer in the patient. Detecting PCGEM1 mRNA in a biological
sample may include: (a) isolating RNA from said biological sample; (b) amplifying a
PCGEM1 cDNA molecule; (c) incubating the PCGEM1 cDNA with the isolated
nucleic acid of the invention; and (d) detecting hybridization between the PCGEM1
cDNA and the isolated nucleic acid. The biological sample can be selected from the
group consisting of blood, urine, and tissue, for example, from a biopsy. In a
preferred embodiment, the biological sample is blood. This method is useful in both
the initial diagnosis of prostate cancer, and the later prognosis of disease. This
method allows for testing prostate tissue in a biopsy, and after removal of a cancerous
prostate, continued monitoring of the blood for micrometastases.

According to this method of diagnosing and prognosticating prostate cancer in 
a patient, the amount of PCGEM1 mRNA in a biological sample from a patient is 
correlated with the presence of prostate cancer in the patient. Those of ordinary skill 
in the art can readily assess the level of over-expression that is correlated with the 
presence of prostate cancer. 

In another embodiment, this invention provides a vector, comprising a 
PCGEM1 promoter sequence operatively linked to a nucleotide sequence encoding a 
cytotoxic protein. The invention further provides a method of selectively killing a 
prostate cancer cell, which comprises introducing the vector to prostate cancer cells 
under conditions sufficient to permit selective killing of the prostate cells. As used 
herein, the phrase "selective killing" is meant to include the killing of at least a cell 
which is specifically targeted by a nucleotide sequence. The putative PCGEM1 
promoter, contained in the 5' flanking region of the PCGEM1 genomic sequence, 
SEQ ID NO:3, is set forth in Figure 11. Applicants envision that a nucleotide
sequence encoding any cytotoxic protein can be incorporated into this vector for
delivery to prostate tissue. For example, the cytotoxic protein can be ricin, abrin,
diphtheria toxin, p53, thymidine kinase, tumor necrosis factor, cholera toxin,
Pseudomonas aeruginosa exotoxin A, ribosomal inactivating proteins, or mycotoxins
such as trichothecenes, and derivatives and fragments (e.g., single chains) thereof.

This invention also provides a method of identifying an androgen-responsive 
cell line, which comprises (a) obtaining a cell line suspected of being androgen-responsive;
(b) incubating the cell line with an androgen; and (c) detecting PCGEM1
mRNA in the cell line, wherein an increase in PCGEM1 mRNA, as compared to an 
untreated cell line, correlates with the cell line being androgen-responsive. 

The invention further provides a method of measuring the responsiveness of a 
prostatic tissue to hormone-ablation therapy, which comprises (a) treating the 
prostatic tissue with hormone-ablation therapy; and (b) measuring PCGEM1 mRNA 
in the prostatic tissue following hormone-ablation therapy, wherein a decrease in
PCGEM1 mRNA, as compared to an untreated cell line, correlates with the cell line
responding to hormone-ablation therapy. 
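
The two comparisons above (an increase in PCGEM1 mRNA after androgen treatment, and a decrease after hormone ablation) each reduce to a ratio of treated to untreated signal. The sketch below is illustrative only: the fold-change cut-offs of 2.0 and 0.5 and the input values are hypothetical and are not taught by this disclosure, which leaves the level of change to those of ordinary skill.

```python
def fold_change(treated, untreated):
    """Ratio of PCGEM1 mRNA signal in treated versus untreated samples."""
    if untreated <= 0:
        raise ValueError("untreated signal must be positive")
    return treated / untreated


def is_androgen_responsive(treated, untreated, min_fold=2.0):
    """True if androgen treatment increases the signal by at least `min_fold` (hypothetical cut-off)."""
    return fold_change(treated, untreated) >= min_fold


def responds_to_ablation(treated, untreated, max_fold=0.5):
    """True if hormone ablation reduces the signal to at most `max_fold` of untreated (hypothetical cut-off)."""
    return fold_change(treated, untreated) <= max_fold


# Hypothetical normalized signal values (arbitrary units).
print(is_androgen_responsive(treated=8.4, untreated=2.1))
print(responds_to_ablation(treated=0.6, untreated=2.1))
```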

In another aspect of the invention, these nucleic acid molecules may be 
introduced into a recombinant vector, such as a plasmid, cosmid, or virus, which can 
be used to transfect or transduce a host cell. The nucleic acids of the present invention 
may be combined with other DNA sequences, such as promoters, polyadenylation 
signals, restriction enzyme sites, multiple cloning sites, and other coding sequences. 

Probes 

Among the uses of nucleic acids of the invention is the use of fragments as 
probes or primers. Such fragments generally comprise at least about 17 contiguous 
nucleotides of a DNA sequence. The fragment may have fewer than 17 nucleotides, 
such as, for example, 10 or 15 nucleotides. In other embodiments, a DNA fragment 
comprises at least 20, at least 30, or at least 60 contiguous nucleotides of a DNA 
sequence. Examples of probes or primers of the invention include those of SEQ ID
NO: 5, SEQ ID NO: 6, and SEQ ID NO: 7, as well as those disclosed in Table I.

Table I

Primer   Sequence (5'->3')             S/AS   Starting Base #   SEQ ID NO.
p413     TGGCAACAGGCAAGCAGAG           S      510               SEQ ID NO: 9
p414     GGCCAAAATAAAACCAAACAT         AS     610               SEQ ID NO: 10
p489     GCAAATATGATTTAAAGATACAAC      S      752               SEQ ID NO: 11
p490     GGTTGTATCTTTAAATCATATTTGC     AS     776               SEQ ID NO: 12
p491     ACTGTCTTTTCATATATTTCTCAATGC   S      559               SEQ ID NO: 13
p517     AAGTAGTAATTTTAAACATGGGAC      AS     1516              SEQ ID NO: 14
p518     TTTTTCAATTAGGCAGCAACC         S      131               SEQ ID NO: 15
p519     GAATTGTCTTTGTGATTGTTTTTAG     S      1338              SEQ ID NO: 16
p560     CAATTCACAAAGACAATTCAGTTAAG    AS     1355              SEQ ID NO: 17
p561     ACAATTAGACAATGTCCAGCTGA       AS     1154              SEQ ID NO: 18
p562     CTTTGGCTGATATCATGAAGTGTC      AS     322               SEQ ID NO: 19
p623     AACCTTTTGCCCTATGCCGTAAC       S      148               SEQ ID NO: 20
p624     GAGACTCCCAACCTGATGATGT        AS     376               SEQ ID NO: 21
p839     GGTCACGTTGAGTCCCAGTG          AS     270               SEQ ID NO: 22


S/AS indicates whether the primer is Sense or AntiSense 

Starting Base # indicates the starting base number with respect to the sequence of 
SEQ ID NO: 1. 

However, even larger probes may be used. For example, a particularly preferred 
probe is derived from PCGEM1 (SEQ ID NO: 1) and comprises nucleotides 116 to 
1140 of that sequence. It has been designated SEQ ID NO: 4 and is set forth in Figure 
12. 

When a hybridization probe binds to a target sequence, it forms a duplex 
molecule that is both stable and selective. These nucleic acid molecules may be 
readily prepared, for example, by chemical synthesis or by recombinant techniques. A 
wide variety of methods are known in the art for detecting hybridization, including 
fluorescent, radioactive, or enzymatic means, or other ligands such as avidin/biotin. 

Because homologs of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 8 from 
other mammalian species are contemplated herein, probes based on the human DNA 
sequence of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 8 may be used to screen 
cDNA libraries derived from other mammalian species, using conventional cross- 
species hybridization techniques. 

In another aspect of the invention, one can use the knowledge of the genetic 
code in combination with the sequences set forth herein to prepare sets of degenerate 
oligonucleotides. Such oligonucleotides are useful as primers, e.g., in polymerase 
chain reactions (PCR), whereby DNA fragments are isolated and amplified. 
Particularly preferred primers are set forth in Figure 13 and Table I and are 
designated SEQ ID NOS: 5-7 and 9-22, respectively. A particularly preferred primer 
pair is p518 (SEQ ID NO: 15) and p839 (SEQ ID NO: 22), which, when used in PCR, 
preferentially amplifies mRNA, thereby avoiding less desirable cross-reactivity with 
genomic DNA. 
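
As a rough illustration of how the Starting Base # column of Table I can be used, the sketch below estimates the size of the cDNA product expected from a sense/antisense primer pair. It assumes (this is an interpretation, not something stated in the specification) that the starting base of an antisense primer is the position of its 5' end counted on SEQ ID NO: 1; that reading is consistent with p489 and p490, which cover the same region and whose starting bases differ by the primer length. The function and dictionary names are hypothetical.

```python
# Starting base numbers copied from Table I (positions on SEQ ID NO: 1).
PRIMERS = {
    "p413": ("S", 510),
    "p414": ("AS", 610),
    "p518": ("S", 131),
    "p839": ("AS", 270),
}


def expected_cdna_product_size(sense_name, antisense_name, primers=PRIMERS):
    """Estimate the cDNA amplicon length for a sense/antisense primer pair.

    Assumption: an antisense primer's starting base is the position of its
    5' end on the sense strand, so the product spans both starting bases.
    """
    s_orient, s_start = primers[sense_name]
    a_orient, a_start = primers[antisense_name]
    if s_orient != "S" or a_orient != "AS":
        raise ValueError("expected a sense primer followed by an antisense primer")
    if a_start <= s_start:
        raise ValueError("antisense primer must lie downstream of the sense primer")
    return a_start - s_start + 1


print(expected_cdna_product_size("p518", "p839"))
print(expected_cdna_product_size("p413", "p414"))
```

The sizes apply to a cDNA template only; the calculation says nothing about the product, if any, obtained from genomic DNA.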

Chromosome Mapping 

As set forth in Example 3, the PCGEM1 gene has been mapped by fluorescent 
in situ hybridization to the 2q32 region of chromosome 2 using a bacterial artificial 
chromosome (BAC) clone containing PCGEM1 genomic sequence. Thus, all or a 
portion of the nucleic acid molecule of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID 
NO: 8, including oligonucleotides, can be used by those skilled in the art using 
well-known techniques to identify human chromosome 2, and the specific locus thereof, 
that contains the PCGEM1 DNA. Useful techniques include, but are not limited to, 
using the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 8, or 
fragments thereof, including oligonucleotides, as a probe in various well-known 
techniques such as radiation hybrid mapping (high resolution), in situ hybridization to 
chromosome spreads (moderate resolution), and Southern blot hybridization to hybrid 
cell lines containing individual human chromosomes (low resolution). 

For example, a gene can be mapped by radiation hybrid mapping. First, 
PCR is performed using the Whitehead Institute/MIT Center for Genome Research 
Genebridge4 panel of 93 radiation hybrids 
(http://www-genome.wi.mit.edu/ftp/distribution/human_STS_releases/july97/rhmap/genebridge4.html). 
Primers are used which lie 
within a putative exon of the gene of interest and which amplify a product from 
human genomic DNA, but do not amplify hamster genomic DNA. The results of the 
PCRs are converted into a data vector that is submitted to the Whitehead/MIT 
Radiation Mapping site on the internet (http://www-seq.wi.mit.edu). The data are 
scored, and the chromosomal assignment and placement relative to known Sequence 
Tag Site (STS) markers on the radiation hybrid map are provided. (The following web 
site provides additional information about radiation hybrid mapping: 
http://www-genome.wi.mit.edu/ftp/distribution/human_STS_releases/july97/07-97.INTRO.html). 
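
The conversion of PCR results into a data vector can be sketched as follows. The sketch assumes a commonly used radiation hybrid convention in which each hybrid contributes one character, in panel order: "1" for a positive PCR, "0" for a negative PCR, and "2" for an ambiguous or untested hybrid. The exact encoding expected by a given mapping server should be checked against its instructions; the function name and the example calls are hypothetical.

```python
PANEL_SIZE = 93  # number of hybrids in the Genebridge4 panel, per the text

# Assumed coding convention: one character per hybrid, in panel order.
CODES = {"negative": "0", "positive": "1", "ambiguous": "2"}


def build_data_vector(results, panel_size=PANEL_SIZE):
    """Turn a list of per-hybrid PCR calls into a data vector string."""
    if len(results) != panel_size:
        raise ValueError(f"expected {panel_size} results, got {len(results)}")
    try:
        return "".join(CODES[call] for call in results)
    except KeyError as err:
        raise ValueError(f"unknown PCR call: {err.args[0]!r}") from None


# Made-up calls for illustration only (not data from this work).
example_calls = ["negative"] * 90 + ["positive", "ambiguous", "positive"]
vector = build_data_vector(example_calls)
print(vector)
```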

Identifying Associated Diseases 

As noted above, PCGEM1 has been mapped to the 2q32 region of 
chromosome 2. This region is associated with specific diseases, which include but are 
not limited to diabetes mellitus (insulin dependent) and T cell leukemia/lymphoma. 
Thus, the nucleic acids of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 8, or 
fragments thereof, can be used by one skilled in the art using well-known techniques 
to analyze abnormalities associated with gene mapping to chromosome 2. This 
enables one to distinguish conditions in which this marker is rearranged or deleted. In 
addition, nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 8, or fragments 
thereof, can be used as a positional marker to map other genes of unknown location. 

The DNA may be used in developing treatments for any disorder mediated 
(directly or indirectly) by defective or insufficient amounts of PCGEM1, including 
prostate cancer. Disclosure herein of native nucleotide sequences permits the 
detection of defective genes, and the replacement thereof with normal genes. 
Defective genes may be detected in in vitro diagnostic assays, and by comparison of a 
native nucleotide sequence disclosed herein with that of a gene derived from a person 
suspected of harboring a defect in this gene. 

Sense-Antisense 

Other useful fragments of the nucleic acids include antisense or sense 
oligonucleotides comprising a single-stranded nucleic acid sequence (either RNA or 
DNA) capable of binding to target mRNA (sense) or DNA (antisense) sequences. 
Antisense or sense oligonucleotides, according to the present invention, comprise a 
fragment of DNA (SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 8). Such a fragment 
generally comprises at least about 14 nucleotides, preferably from about 14 to about 
30 nucleotides. The ability to derive an antisense or a sense oligonucleotide, based 
upon a cDNA sequence encoding a given protein is described in, for example, Stein 
and Cohen (Cancer Res. 48:2659, 1988) and van der Krol et al. (BioTechniques 6:958, 
1988). 

The biologic activity of PCGEM1 in assay cells and the over expression of 
PCGEM1 in prostate cancer tissues suggest that elevated levels of PCGEM1 promote 
prostate cancer cell growth. Thus, the antisense oligonucleotides to PCGEM1 may be 
used to reduce the expression of PCGEM1 and, consequently, inhibit the growth of 
the cancer cells. 
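
To make the relationship between a sense sequence and its antisense oligonucleotides concrete, the following sketch derives candidate antisense oligonucleotides as reverse complements of windows of a sense-strand fragment. It is illustrative only: the function names are hypothetical, the default length of 20 is an arbitrary choice within the 14 to 30 nucleotide range given above, and no selection criteria (melting temperature, secondary structure, specificity) are applied. Primers p489 (sense) and p490 (antisense) of Table I are used as a check because they cover the same region of SEQ ID NO: 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligos(sense_fragment, length=20):
    """Yield (start, oligo) for every antisense oligo of the given length.

    start is the 1-based position, within sense_fragment, of the first
    sense-strand base covered by the oligo.
    """
    if not 14 <= length <= 30:
        raise ValueError("the text suggests about 14 to about 30 nucleotides")
    for i in range(len(sense_fragment) - length + 1):
        yield i + 1, reverse_complement(sense_fragment[i:i + length])


# p489 (sense) and p490 (antisense) from Table I cover the same region.
p489 = "GCAAATATGATTTAAAGATACAAC"
p490 = "GGTTGTATCTTTAAATCATATTTGC"
print(reverse_complement(p490))
```

The reverse complement of p490 begins with the full p489 sequence, as expected for a sense/antisense pair directed at the same site.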

Binding of antisense or sense oligonucleotides to target nucleic acid sequences 
results in the formation of duplexes. The antisense oligonucleotides thus may be used 
to block expression of proteins or to inhibit the function of RNA. Antisense or sense 
oligonucleotides further comprise oligonucleotides having modified sugar- 
phosphodiester backbones (or other sugar linkages, such as those described in 
WO 91/06629) and wherein such sugar linkages are resistant to endogenous nucleases. 
Such oligonucleotides with resistant sugar linkages are stable in vivo (i.e., capable of 
resisting enzymatic degradation) but retain sequence specificity to be able to bind to 
target nucleotide sequences. 

Other examples of sense or antisense oligonucleotides include those 
oligonucleotides which are covalently linked to organic moieties, such as those 
described in WO 90/10448, and other moieties that increase affinity of the 
oligonucleotide for a target nucleic acid sequence, such as poly-(L-lysine). Further 
still, intercalating agents, such as ellipticine, and alkylating agents or metal complexes 
may be attached to sense or antisense oligonucleotides. Such modifications may 
modify binding specificities of the antisense or sense oligonucleotide for the target 
nucleotide sequence. 

Antisense or sense oligonucleotides may be introduced into a cell containing 
the target nucleic acid sequence by any gene transfer method, including, for example, 
lipofection, CaPO4-mediated DNA transfection, electroporation, or by using gene 
transfer vectors such as Epstein-Barr virus or adenovirus. 

Sense or antisense oligonucleotides also may be introduced into a cell 
containing the target nucleotide sequence by formation of a conjugate with a ligand 
binding molecule, as described in WO 91/04753. Suitable ligand binding molecules 
include, but are not limited to, cell surface receptors, growth factors, other cytokines, 
or other ligands that bind to cell surface receptors. Preferably, conjugation of the 
ligand binding molecule does not substantially interfere with the ability of the ligand 
binding molecule to bind to its corresponding molecule or receptor, or block entry of 
the sense or antisense oligonucleotide or its conjugated version into the cell. 

Alternatively, a sense or an antisense oligonucleotide may be introduced into a 
cell containing the target nucleic acid sequence by formation of an oligonucleotide- 
lipid complex, as described in WO 90/10448. The sense or antisense oligonucleotide- 
lipid complex is preferably dissociated within the cell by an endogenous lipase. 



POLYPEPTIDES AND FRAGMENTS THEREOF 

The invention also encompasses polypeptides and fragments thereof in various 
forms, including those that are naturally occurring or produced through various 
techniques such as procedures involving recombinant DNA technology. Such forms 
include, but are not limited to, derivatives, variants, and oligomers, as well as fusion 
proteins or fragments thereof. 

The polypeptides of the invention include full length proteins encoded by the 
nucleic acid sequences set forth above. The polypeptides of the invention may be 
membrane bound or they may be secreted and thus soluble. The invention also 
includes the expression, isolation and purification of the polypeptides and fragments 
of the invention, accomplished by any suitable technique. 

The following examples further illustrate preferred aspects of the invention. 

EXAMPLE 1: Differential Gene Expression Analysis in Prostate Cancer 

Using the differential display technique, we identified a novel gene that is 
over-expressed in prostate cancer cells. Differential display provides a method to 
separate and clone individual messenger RNAs by means of the polymerase chain 
reaction, as described in Liang et al., Science, 257:967-71 (1992), which is hereby 
incorporated by reference. Briefly, the method entails using two groups of 
oligonucleotide primers. One group is designed to recognize the polyadenylate tail of 
messenger RNAs. The other group contains primers that are short and arbitrary in 
sequence and anneal to positions in the messenger RNA randomly distributed from 
the polyadenylate tail. Products amplified with these primers can be differentiated on 
a sequencing gel based on their size. If different cell populations are amplified with 
the same groups of primers, one can compare the amplification products to identify 
differentially expressed RNA sequences. 

Differential display ("DD") kits from Genomyx (Foster City, California) were 
used to analyze differential gene expression. The steps of the differential display 
technique are summarized in Figure 1. Histologically well defined matched tumor 
and normal prostate tissue sections containing approximately similar proportions of 
epithelial cells were chosen from individual prostate cancer patients. 

Genomic DNA-free total RNA was extracted from this enriched pool of cells 
using RNAzol B (Tel-Test, Inc., Friendswood, TX) according to the manufacturer's 
protocol. The epithelial nature of the RNA source was further confirmed using 
cytokeratin 18 expression (45) in reverse transcriptase-polymerase chain reaction 
(RT-PCR) assays. Using arbitrary and anchored primers containing 5' M13 or T7 
sequences (obtained from Biomedical Instrumentation Center, Uniformed Services 
University of the Health Sciences, Bethesda), the isolated DNA-free total RNA was 
amplified by RT-PCR, which was performed using ten anchored antisense primers and 
four arbitrary sense primers according to the protocol provided by Hieroglyph™ RNA 
Profile Kit 1 (Genomyx Corporation, CA). The cDNA fragments produced by the 
RT-PCR assay were analyzed by high resolution gel electrophoresis, carried out by 
using a Genomyx™ LR DNA sequencer and LR-Optimized™ HR-1000™ gel 
formulations (Genomyx Corporation, CA). 

A partial DD screening of normal/tumor tissues revealed 30 differentially 
expressed cDNA fragments, with 53% showing reduced or no expression in tumor 
RNA specimens and 47% showing over expression in tumor RNA specimens (Figure 
2). These cDNAs were excised from the DD gels, reamplified using T7 and M13 
primers and the RT-PCR conditions recommended in Hieroglyph™ RNA Profile Kit-1 
(Genomyx Corp., CA), and sequenced. The inclusion of T7 and M13 sequencing 
primers in the DD primers allowed rapid sequencing and orientation of cDNAs 
(Figure 1). 

All the reamplified cDNA fragments were purified by Centricon-c-100 system 
(Amicon, USA). The purified fragments were sequenced by cycle sequencing and 
DNA sequence determination using an ABI 377 DNA sequencer. Isolated sequences 
were analyzed for sequence homology with known sequences by running searches 
through publicly available DNA sequence databases, including the National Center for 
Biotechnology Information and the Cancer Genome Anatomy Project. Approximately 
two-thirds of these cDNA sequences exhibited homology to previously described 
DNA sequences/genes, e.g., ribosomal proteins, mitochondrial DNA sequences, 
growth factor receptors, and genes involved in maintaining the redox state in cells. 
About one-third of the cDNAs represented novel sequences, which did not exhibit 
similarity to the sequences available in publicly available databases. The PCGEM1 
fragment obtained from the initial differential display screening represents a 530 base 
pair (nucleotides 410 to 940 of SEQ ID NO: 1) cDNA sequence which, in initial 
searches, did not exhibit any significant homology with sequences in the publicly 
available databases. Later searching of the high throughput genome sequence 
(HTGS) database revealed perfect homology to a chromosome 2 derived 
uncharacterized, unfinished genomic sequence (accession # AC 013401). 

EXAMPLE 2: Characterization of Full Length PCGEM1 cDNA Sequence 

The full length of PCGEM1 was obtained by 5' and 3' RACE/PCR from the 
original 530 bp DD product (nucleotides 410 to 940 of PCGEM1 cDNA SEQ ID 
NO: 1) using a normal prostate cDNA library in lambda phage (Clontech, CA). The 
RACE/PCR products were directly sequenced. Lasergene and MacVector DNA 
analysis software were used to analyze DNA sequences and to define open reading 
frame regions. We also used the original DD product to screen a normal prostate 
cDNA library. Three overlapping cDNA clones were identified. 

Sequencing of the cDNA clones was performed using an ABI-310 sequence 
analyzer and a new dRhodamine cycle sequencing kit (PE-Applied Biosystems, CA). 
The longest PCGEM1 cDNA clone, SEQ ID NO: 1 (Figure 8), revealed 1643 
nucleotides with a potential polyadenylation site, ATTAAA, close to the 3' end 
followed by a poly (A) tail. As noted above, although initial searching of the PCGEM1 
gene in publicly available DNA databases (e.g., National Center for Biotechnology 
Information) using the BLAST program did not reveal any homology, a recent search 
of the HTGS database revealed perfect homology of PCGEM1 (using cDNA of SEQ 
ID NO: 1) to a chromosome 2 derived uncharacterized, unfinished genomic sequence 
(accession # AC 013401). One of the cDNA clones, SEQ ID NO: 2 (Figure 9), 
contained a 123 bp insertion at position 278, and this inserted sequence showed strong 
homology (87%) to Alu sequence. It is likely that this clone represented a 
premature transcript. Sequencing of several clones from RT-PCR further confirmed 
the presence of the two forms of transcripts. 

Sequence analysis did not reveal any significant long open reading frame in 
either strand. The longest ORF in the sense strand was 105 nucleotides (572-679) 
encoding a 35 amino acid peptide. However, the ATG was not in a strong initiation 
context. Although we could not rule out the coding capacity for a very small 
peptide, it is possible that PCGEM1 may function as a non-coding RNA. 
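
The ORF figures above can be checked with a little arithmetic: the interval 572-679 spans 108 nucleotides, which corresponds to 105 coding nucleotides (35 codons) plus, presumably, a 3-nucleotide stop codon. The sketch below shows this check together with a minimal forward-strand ORF scan. The scan is illustrative only; it is not the Lasergene or MacVector procedure used in this Example, the function name is hypothetical, and the toy sequence is made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, n_codons) of the longest ATG...stop ORF.

    Scans the three forward frames only. start and end are 1-based and
    inclusive, with end covering the stop codon; n_codons excludes it.
    Returns None if no complete ORF is found.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if best is None or n_codons > best[2]:
                    best = (start + 1, i + 3, n_codons)
                start = None
    return best


# Arithmetic for the interval quoted in the text (572-679).
span = 679 - 572 + 1          # nucleotides covered, inclusive
coding = span - 3             # assuming the interval includes the stop codon
print(span, coding, coding // 3)

# Toy sequence (made up): two leading bases, ATG, two codons, then a stop.
print(longest_orf("CCATGAAACCCTAGGG"))
```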

The sequence of PCGEM1 cDNA has been verified by several approaches 
including characterization of several clones of PCGEM1 and analysis of PCGEM1 
cDNAs amplified from normal prostate tissue and prostate cancer cell lines. We have 
also obtained the genomic clones of PCGEM1, which have helped to confirm the 
PCGEM1 cDNA sequence. The complete genomic DNA sequence of PCGEM1 
(SEQ ID NO: 8) is shown in Figure 14. In Figure 14 (and in the accompanying 
Sequence Listing), "Y" represents any one of the four nucleotide bases, cytosine, 
thymine, adenine, or guanine. Comparison of the cDNA and genomic sequences 
revealed the organization of the PCGEM1 transcription unit from three exons (Figure 
15: E, Exon; B: BamHI; H: HindIII; X: XbaI; R: EcoRI). 

EXAMPLE 3: Mapping the Location of PCGEM1 

Using fluorescent in situ hybridization and the PCGEM1 genomic DNA as a 
probe, we mapped the location of PCGEM1 on chromosome 2q to the specific region 
2q32 (Figure 7A). Specifically, a Bacterial Artificial Chromosome (BAC) clone 
containing the PCGEM1 genomic sequence was isolated by custom services of 
Genome Systems (St. Louis, MO). PCGEM1-BAC clone 1 DNA was nick translated 
using spectrum orange (Vysis) as a direct label, and fluorescent in situ hybridization 
was done using this probe on normal human male metaphase chromosome spreads. 
Counterstaining was done and chromosomal localization was determined based on the 
G-band analysis of inverted 4',6-diamidino-2-phenylindole (DAPI) images. (Figure 
7B: a DAPI counter-stained chromosome 2 is shown on the left; an inverted DAPI 
stained chromosome 2 shown as G-bands is shown in the center; an ideogram of 
chromosome 2 showing the localization of the signal to band 2q32 (bar) is shown on 
the right.) NU200 image acquisition and registration software was used to create the 
digital images. More than 20 metaphases were analyzed. 

EXAMPLE 4: Analysis of PCGEM1 Gene Expression in Prostate Cancer 

To further characterize the tumor specific expression of the PCGEM1 
fragment, and also to rule out individual variations of gene expression alterations 
commonly observed in tumors, the expression of the PCGEM1 fragment was 
evaluated on a test panel of matched tumor and normal RNAs derived from the 
microdissected tissues of twenty prostate cancer patients. 

Using the PCGEM1 cDNA sequence (SEQ ID NO: 1), specific PCR primers 
(Sense primer 1 (SEQ ID NO: 5): 5' TGCCTCAGCCTCCCAAGTAAC 3' and 
Antisense primer 2 (SEQ ID NO: 6): 5' GGCCAAAATAAAACCAAACAT 3') were 
designed for RT-PCR assays. Radical prostatectomy derived OCT compound (Miles 
Inc., Elkhart, IN) embedded fresh frozen normal and tumor tissues from prostate 
cancer patients were characterized for histopathology by examining hematoxylin and 
eosin stained sections (46). Tumor and normal prostate tissue regions representing 
approximately equal numbers of epithelial cells were dissected out of frozen sections. 
DNA-free RNA was prepared from these tissues and used in RT-PCR analysis to 
detect PCGEM1 expression. One hundred nanograms of total RNA was reverse 
transcribed into cDNA using an RT-PCR kit (Perkin-Elmer, Foster City, CA). The PCR 
was performed using Amplitaq Gold from Perkin-Elmer (Foster City, CA). PCR cycles 
used were: 95°C for 10 minutes, 1 cycle; 95°C for 30 seconds, 55°C for 30 seconds, 
72°C for 30 seconds, 42 cycles; and 72°C for 5 minutes, 1 cycle, followed by 4°C 
storage. Epithelial cell-associated cytokeratin 18 was used as an internal control. 
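
For readers who want to reproduce the cycling conditions, the program can be written down as a small data structure. The sketch below transcribes the temperatures, times, and cycle number given above and adds up the programmed hold times; the variable and function names are hypothetical, and ramping between temperatures and the final 4°C hold are not included.

```python
# Cycling program transcribed from the text: (temperature in deg C, seconds).
INITIAL_DENATURATION = (95, 10 * 60)
CYCLE_STEPS = [(95, 30), (55, 30), (72, 30)]
N_CYCLES = 42
FINAL_EXTENSION = (72, 5 * 60)


def programmed_hold_time_seconds():
    """Sum the programmed hold times, ignoring ramping and the 4 deg C hold."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]


total = programmed_hold_time_seconds()
print(total, "seconds =", total / 60, "minutes")
```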

RT-PCR analysis of microdissected matched normal and tumor tissue derived 
RNAs from 23 CaP patients revealed tumor associated overexpression of PCGEM1 in 
13 (56%) of the patients (Figure 5). Six of twenty-three (26%) patients did not exhibit 
detectable PCGEM1 expression in either normal or tumor tissue derived RNAs. 
Three of twenty-three (13%) tumor specimens showed reduced expression in tumors. 
One of the patients did not exhibit any change. Expression of housekeeping genes, 
cytokeratin-18 (Figure 3) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) 
(data not shown), remained constant in tumor and normal specimens of all the 
patients. These results were further confirmed by another set of PCGEM1 specific 
primers (Sense Primer 3 (SEQ ID NO: 7): 5' TGGCAACAGGCAAGCAGAG 3' and 
Antisense Primer 2 (SEQ ID NO: 6): 5' GGCCAAAATAAAACCAAACAT 3'). 
Four of 16 (25%) patients did not exhibit detectable PCGEM1 expression in either 
normal or tumor tissue derived RNAs. Two of 16 (12.5%) tumor specimens showed 
reduced expression in tumors. These results of PCGEM1 expression in tumor tissues 
could be explained by the expected individual variations between tumors of different 
patients. Most importantly, initial DD observations were confirmed by showing that 
45% of patients analyzed did exhibit over expression of PCGEM1 in tumor prostate 
tissues when compared to corresponding normal prostate tissue of the same 
individual. 
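
The patient counts reported for the first primer pair can be tallied to confirm that the four categories account for all 23 patients and to recompute the percentages. The snippet uses only the counts stated above; the category labels and the helper name are ours.

```python
# Patient counts for the first primer pair, as reported in the text.
counts = {
    "overexpressed in tumor": 13,
    "not detected in normal or tumor": 6,
    "reduced in tumor": 3,
    "no change": 1,
}

total = sum(counts.values())


def percent(n, total):
    """Percentage of n out of total, rounded to one decimal place."""
    return round(100.0 * n / total, 1)


for label, n in counts.items():
    print(f"{label}: {n}/{total} = {percent(n, total)}%")
```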

EXAMPLE 5: In situ Hybridization 

In situ hybridization was performed essentially as described by Wilkinson and 
Green (48). Briefly, OCT embedded tissue slides stored at -80°C were fixed in 4% 
PFA (paraformaldehyde), digested with proteinase K and then again fixed in 4% PFA. 
After washing in PBS, sections were treated with 0.25% acetic anhydride in 0.1 M 
triethanolamine, washed again in PBS, and dehydrated in a graded ethanol series. 
Sections were hybridized with 35S-labeled riboprobes at 52°C overnight. After 
washing and RNase A treatment, sections were dehydrated, dipped into NTB-2 
emulsion and exposed for 11 days at 4°C. After development, slides were lightly 
stained with hematoxylin and mounted for microscopy. In each section, PCGEM1 
expression was scored as percentage of cells showing 35S signal: 1+, 1-25%; 2+, 25- 
50%; 3+, 50-75%; 4+, 75-100%. 
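
The scoring scale can be expressed as a simple lookup. Because the stated ranges share their end points (for example, 25% appears in both 1+ and 2+), the sketch below has to make a choice that the text does not: a boundary value is assigned to the lower category, and a section with no positive cells is scored 0. Both choices, and the function name, are assumptions made for illustration.

```python
def score_signal(percent_positive):
    """Map the percentage of cells showing signal to the 0 to 4+ scale.

    Assumptions: a section with no positive cells scores 0, and a value on
    a shared boundary (25, 50, 75) is assigned to the lower category.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return "0"
    for upper, label in ((25, "1+"), (50, "2+"), (75, "3+"), (100, "4+")):
        if percent_positive <= upper:
            return label


print(score_signal(10), score_signal(40), score_signal(90))
```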

Paired normal (benign) and tumor specimens from 13 patients were tested 
using in situ hybridization. A representative example is shown in Figure 17. In 11 
cases (84%) tumor associated elevation of PCGEM1 expression was detected. In 5 of 
these 11 patients the expression of PCGEM1 increased to 1+ in the tumor area from 
an essentially undetectable level in the normal area (on the 0 to 4+ scale). Tumor 
specimens from 4 of 11 patients scored between 2+ (example shown in Figure 17B) 
and 4+. Two of 11 patients showed focal signals with 3+ score in the tumor area, and 
one of these patients had similar focal signal (2+) in an area pathologically designated 
as benign. In the remaining 2 of the 13 cases there was no detectable signal in any of 
the tissue areas tested. The results indicate that PCGEM1 expression appears to be 
restricted to glandular epithelial cells. (Figure 17 shows an example of in situ 
hybridization of 35S-labeled PCGEM1 riboprobe to matched normal (A) versus tumor 
(B) sections of prostate cancer patients. The light gray areas are hematoxylin stained 
cell bodies; the black dots represent the PCGEM1 expression signal. The signal is 
background level in the normal (A), 2+ level in the tumor (B) section. The 
magnification is 40x.) 

EXAMPLE 6: PCGEM1 Gene Expression in Prostate Tumor Cell Lines 

PCGEM1 gene expression was also evaluated in established prostate cancer 
cell lines: LNCaP, DU145, PC3 (all from ATCC), DuPro (available from Dr. David 
Paulson, Duke University, Durham, NC), and an E6/E7-immortalized primary 
prostate cancer cell line, CPDR1 (47). CPDR1 is a primary CaP derived cell line 
immortalized by retroviral vector, LXSN 16 E6 E7, expressing the E6 and E7 genes of 
human papilloma virus 16. LNCaP is a well studied, androgen-responsive prostate 
cancer cell line, whereas DU145, PC3, DuPro and CPDR1 are androgen-independent 
and lack detectable expression of the androgen receptor. Utilizing the RT-PCR assay 
described above, PCGEM1 expression was easily detectable in LNCaP (Figure 4). 
However, PCGEM1 expression was not detected in prostate cancer cell lines DU145, 
PC3, DuPro and CPDR1. Thus, PCGEM1 was expressed in the androgen-responsive 
cell line but not in the androgen-independent cell lines. These results indicate that 
hormones, particularly androgen, may play a key role in regulating PCGEM1 
expression in prostate cancer cells. In addition, the results suggest that PCGEM1 
expression may be used to distinguish between hormone responsive tumor cells and 
more aggressive hormone refractory tumor cells. 

To test if PCGEM1 expression is regulated by androgens, we performed 
experiments evaluating PCGEM1 expression in LNCaP cells (ATCC) cultured with 
and without androgens. Total RNA from LNCaP cells treated with the synthetic 
androgen R1881 (DUPONT, Boston, MA) was analyzed for 
PCGEM1 expression. Both RT-PCR analysis (Figure 5a) and Northern blot analysis 
(Figure 5b) were conducted as follows. 

LNCaP cells were maintained in RPMI 1640 (Life Technologies, Inc., 
Gaithersburg, MD) supplemented with 10% fetal bovine serum (FBS, Life 
Technologies, Inc., Gaithersburg, MD) and experiments were performed on cells 
between passages 20 and 35. For the studies of NKX3.1 gene expression regulation, 
charcoal/dextran stripped androgen-free FBS (cFBS, Gemini Bio-Products, Inc., 
Calabasas, CA) was used. LNCaP cells were cultured first in RPMI 1640 with 10% 
cFBS for 4 days and then stimulated with a non-metabolizable androgen analog 
R1881 (DUPONT, Boston, MA) at different concentrations for different times as 
shown in Figure 5A. LNCaP cells identically treated but without R1881 served as 
control. Poly A+ RNA derived from cells treated with/without R1881 was extracted 
at indicated time points with RNAzol B (Tel-Test, Inc, TX) and fractionated 
(2 µg/lane) by running on a 1% formaldehyde-agarose gel and transferred to nylon 
membrane. Northern blots were analyzed for the expression of PCGEM1 using the 
nucleic acid molecule set forth in SEQ ID NO: 4 as a probe. The RNA from LNCaP 
cells treated with R1881 and RNA from control LNCaP cells were also analyzed by 
RT-PCR assays as described in Example 4. 

As set forth in Figures 5a and 5b, PCGEM1 expression increases in response 
to androgen treatment. This finding further supports the hypothesis that the PCGEM1 
expression is regulated by androgens in prostate cancer cells. 

EXAMPLE 7: Tissue Specificity of PCGEM1 Expression 

Multiple tissue Northern blots (Clontech, CA) conducted according to the 
manufacturer's directions revealed prostate tissue-specific expression of PCGEM1. 
Polyadenylated RNAs of 23 different human tissues (heart, brain, placenta, lung, liver, 
skeletal muscle, kidney, pancreas, spleen, thymus, prostate, testis, ovary, small 
intestine, colon, peripheral blood, stomach, thyroid, spinal cord, lymph node, trachea, 
adrenal gland and bone marrow) were probed with the 530 base pair PCGEM1 cDNA 
fragment (nucleotides 410 to 940 of SEQ ID NO: 1). A 1.7 kilobase mRNA transcript 
hybridized to the PCGEM1 probe in prostate tissue (Figure 6a). Hybridization was 
not observed in any of the other human tissues (Figure 6a). Two independent 
experiments revealed identical results. 

Additional Northern blot analyses on an RNA master blot (Clontech, CA) 
conducted according to the manufacturer's directions confirm the prostate tissue 
specificity of the PCGEM1 gene (Figure 6b). Northern blot analyses reveal that the 
prostate tissue specificity of PCGEM1 is comparable to the well known prostate 
marker PSA (77mer oligo probe) and far better than two other prostate specific genes 
PSMA (234 bp fragment from PCR product) and NKX3.1 (210 bp cDNA). For 
instance, PSMA is expressed in the brain (37) and in the duodenal mucosa and a 
subset of proximal renal tubules (38). While NKX3.1 exhibits high levels of 
expression in adult prostate, it is also expressed at lower levels in testis tissue and 
several other tissues (39). 



EXAMPLE 8: Biologic Functions of PCGEM1 

The tumor associated PCGEM1 overexpression suggested that the increased 
expression of PCGEM1 may favor tumor cell proliferation. NIH3T3 cells have been 
extensively used to define cell growth promoting functions associated with a wide 
variety of genes (40-44). Utilizing pcDNA3.1/Hygro(+/-)(Invitrogen, CA), PCGEM1 
expression vectors were constructed in sense and anti-sense orientations and were 
transfected into NIH3T3 cells, and hygromycin resistant colonies were counted 2-3 
weeks later. Cells transfected with PCGEM1 sense construct formed about 2 times 
more colonies than vector alone in three independent experiments (Figure 10). The 
size of the colonies in PCGEM1 sense construct transfected cells was significantly 
larger. No appreciable difference was observed in the number of colonies between 
anti-sense PCGEM1 constructs and vector controls. These promising results 
document a cell growth promoting/cell survival function(s) associated with PCGEM1. 

The function of PCGEM1, however, does not appear to be due to protein 
expression. To assess this hypothesis, we used the TestCode program (GCG 
Wisconsin Package, Madison, WI), which identifies potential protein coding 
sequences of longer than 200 bases by measuring the non-randomness of the 
composition at every third base, independently from the reading frames. Analysis of 
the PCGEM1 cDNA sequence revealed that, at greater than 95% confidence level, the 
sequence does not contain any region with protein coding capacity (Figure 16A). 
Similar results were obtained when various published non-coding RNA sequences 
were analyzed with the TestCode program (data not shown), while known protein 
coding regions of similar size, e.g., alpha actin (Figure 16B), can be detected with high 
fidelity. (In Figure 16, evaluation of the coding capacity of PCGEM1 (A) and 
human alpha actin (B) is performed independently from the reading frame, by using 
the TestCode program. The number of base pairs is indicated on the X-axis; the 
TestCode values are shown on the Y-axis. Regions of longer than 200 base pairs 
above the upper line (at 9.5 value) are considered coding, and those under the lower 
line (at 7.3 value) are considered non-coding, at a confidence level greater than 95%.) 
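
The way the two threshold lines are read can be illustrated with a short sketch. It takes already computed TestCode values (the statistic itself is not reimplemented here) and applies the 9.5 and 7.3 cut-offs quoted above; the label "undetermined" for values between the two lines is our wording, and the function name and example values are hypothetical.

```python
CODING_THRESHOLD = 9.5      # upper line in Figure 16
NONCODING_THRESHOLD = 7.3   # lower line in Figure 16


def classify_testcode(value):
    """Classify a single TestCode value using the two quoted thresholds."""
    if value > CODING_THRESHOLD:
        return "coding"
    if value < NONCODING_THRESHOLD:
        return "non-coding"
    return "undetermined"


# Made-up values for illustration only (not read from Figure 16).
example_values = [5.1, 6.8, 8.0, 10.2]
print([classify_testcode(v) for v in example_values])
```

As the text notes, a call is meaningful only for regions longer than 200 base pairs, so single values such as these would in practice be considered over an extended window.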

The Codon Preference program (GCG Wisconsin Package, Madison, WI), 
which locates protein coding regions in a reading frame specific manner, further 
suggested the absence of protein coding capacity in the PCGEM1 gene (see 
www.cpdr.org). In vitro transcription/translation of PCGEM1 cDNA did not produce 
a detectable protein/peptide. Although we cannot unequivocally rule out the 
possibility that PCGEM1 codes for a short unstable peptide, at this time both 
experimental and computational approaches strongly suggest that PCGEM1 cDNA 
does not have protein coding capacity. (It should be recognized that conclusions 
regarding the role of PCGEM1 are speculative in nature, and should not be considered 
limiting in any way.) 
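
A frame-by-frame inspection of a sequence can be sketched with a much simpler stand-in. The code below is not the Codon Preference algorithm, which scores codon usage; it only reports the longest ATG-to-stop open reading frame in the three forward reading frames. The function name `longest_orf` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf(seq):
    """Length, in codons, of the longest ATG..stop open reading frame
    in the three forward frames (ATG counted, stop codon not counted).
    Frames that never reach a stop codon are ignored."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        run = None  # codons seen since the current ATG, or None if no ATG yet
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if run is None:
                if codon == "ATG":
                    run = 1
            elif codon in STOP_CODONS:
                best = max(best, run)
                run = None
            else:
                run += 1
    return best
```

A short longest ORF is only weak evidence against coding capacity, which is consistent with the caution above that a short unstable peptide cannot be ruled out.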

The most intriguing aspect of PCGEM1 characterization has been its apparent 
lack of protein coding capacity. Although we have not completely ruled out the 
possibility that PCGEM1 codes for a short unstable peptide, careful sequencing of 
PCGEM1 cDNA and genomic clones, computational analysis of the PCGEM1 sequence, 
and in vitro transcription/translation experiments (data not shown) strongly suggest a 
non-coding nature of PCGEM1. It is interesting to note that an emerging group of 
novel mRNA-like non-coding RNAs is being discovered whose function and 
mechanisms of action remain poorly understood (49). Such RNA molecules have also 
been termed "RNA riboregulators" because of their function(s) in development, 
differentiation, DNA damage, heat shock responses and tumorigenesis (40-42, 50). In 
the context of tumorigenesis, the H19, His-1 and Bic genes code for functional 
non-coding mRNAs (50). In addition, a recently reported prostate cancer associated gene, 
DD3, also appears to exhibit a tissue specific non-coding mRNA (51). In this regard it 
is important to point out that PCGEM1 and DD3 may represent a new class of 
prostate specific genes. The recent discovery of a steroid receptor co-activator as an 
mRNA lacking protein coding capacity further emphasizes the role of RNA 
riboregulators in critical biochemical function(s) (52). Our preliminary results 
showed that PCGEM1 expression in NIH3T3 cells caused a significant increase in the 
size of colonies in a colony forming assay and suggests that PCGEM1 cDNA confers 
cell proliferation and/or cell survival function(s). Elevated expression of PCGEM1 in 
prostate cancer cells may represent a gain in function favoring tumor cell 
proliferation/survival. On the basis of our first characterization of the PCGEM1 gene, we 
propose that PCGEM1 belongs to a novel class of prostate tissue specific genes with 
potential functions in prostate cell biology and the tumorigenesis of the prostate gland. 

In summary, utilizing surgical specimens and rapid differential display 
technology, we have identified candidate genes of interest with differential expression 
profiles in prostate cancer specimens. In particular, we have identified a novel 
nucleotide sequence, PCGEM1, with no match in the publicly available DNA 
databases (except for the homology shown in the high throughput genome sequence 
database, discussed above). A PCGEM1 cDNA fragment detected a 1.7 kb mRNA on 
Northern blots with selective expression in prostate tissue. Furthermore, this gene 
was found to be up-regulated by the synthetic androgen, R1881. Careful analysis of 
microdissected matched tumor and normal tissues further revealed PCGEM1 
over-expression in a significant percentage of prostate cancer specimens. Thus, we have 
provided a gene with broad implications for the diagnosis, prevention, and treatment 
of prostate cancer. 

The specification is most thoroughly understood in light of the teachings of the 
references cited within the specification, which are hereby incorporated by reference. 
The embodiments within the specification provide an illustration of embodiments of 
the invention and should not be construed to limit the scope of the invention. The 
skilled artisan readily recognizes that many other embodiments are encompassed by 
the invention. 

We claim: 

1. An isolated nucleic acid molecule selected from: 

(a) the polynucleotide sequence of SEQ ID NO:1, SEQ ID NO:2, or 
SEQ ID NO:8; 

(b) an isolated nucleic acid molecule that hybridizes to either strand of 
a denatured, double-stranded DNA comprising the nucleic acid sequence of (a) under 
conditions of moderate stringency in about 50% formamide and about 6X SSC at 
about 42°C with washing conditions of approximately 60°C, about 0.5X SSC, and 
about 0.1% SDS; 

(c) an isolated nucleic acid molecule that hybridizes to either strand of 
a denatured, double-stranded DNA comprising the nucleic acid sequence of (a) under 
conditions of high stringency in about 50% formamide and about 6X SSC, with 
washing conditions of approximately 68°C, about 0.2X SSC, and about 0.1% SDS; 

(d) an isolated nucleic acid molecule derived by in vitro mutagenesis 
from SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:8; 

(e) an isolated nucleic acid molecule degenerate from SEQ ID NO:1, 
SEQ ID NO:2, or SEQ ID NO:8, as a result of the genetic code; and 

(f) an isolated nucleic acid molecule selected from the group consisting 
of human PCGEM1 DNA, an allelic variant of human PCGEM1 DNA, and a species 
homolog of PCGEM1 DNA. 

2. A recombinant vector that directs the expression of the nucleic acid 
molecule of claim 1. 

3. A host cell transfected or transduced with the vector of claim 2. 

4. The host cell of claim 3 selected from bacterial cells, yeast cells, and 
animal cells. 

5. An isolated nucleic acid molecule comprising the polynucleotide sequence 
selected from SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID 
NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID 
NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID 
NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, and SEQ ID NO: 22. 

6. A method of detecting prostate cancer in a patient, the method comprising: 

(a) detecting PCGEM1 mRNA in a biological sample from the 
patient; and 

(b) correlating the amount of PCGEM1 mRNA in the sample with 
the presence of prostate cancer in the patient. 

7. The method according to claim 6, wherein step (a) includes: 

(a) isolating RNA from the sample; 

(b) amplifying a PCGEM1 cDNA molecule; 

(c) incubating the PCGEM1 cDNA with the nucleic acid according 
to claim 1 or 5; and 

(d) detecting hybridization between the PCGEM1 cDNA and the 
nucleic acid. 

8. The method according to claim 7, wherein the PCGEM1 cDNA is 
amplified with at least two nucleotide sequences selected from SEQ ID NO: 5, SEQ 
ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID 
NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID 
NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, and SEQ 
ID NO: 22. 

9. The method according to claim 8, wherein the at least two nucleotide 
sequences are SEQ ID NO: 15 and SEQ ID NO:22. 

10. A method according to claim 6, wherein the biological sample is selected 
from blood, urine, and prostate tissue. 

11. The method according to claim 10, wherein the biological sample is 
blood. 

12. A vector, comprising a PCGEM1 promoter sequence operatively linked to 
a nucleotide sequence encoding a cytotoxic protein. 

13. The vector of claim 12, wherein the PCGEM1 promoter sequence is a 
nucleic acid molecule comprising the polynucleotide sequence of SEQ ID NO:3. 

14. A method of selectively killing a prostate cancer cell, the method 
comprising: 

(a) introducing the vector according to claim 12 to the prostate cancer 
cell under conditions sufficient to permit selective cell killing. 

15. The method according to claim 14, wherein the cytotoxic protein is 
selected from ricin, abrin, diphtheria toxin, p53, thymidine kinase, tumor necrosis 
factor, cholera toxin, Pseudomonas aeruginosa exotoxin A, ribosomal inactivating 
proteins, and mycotoxins. 

16. A method of identifying an androgen-responsive cell line, the method 
comprising: 

(a) obtaining a cell line suspected of being androgen responsive, 

(b) incubating the cell line with an androgen; and 

(c) detecting PCGEM1 mRNA in the cell line, 

wherein an increase in PCGEM1 mRNA, as compared to an untreated cell 
line, correlates with the cell line being androgen responsive. 

17. A method of measuring the responsiveness of a prostate tissue to 
hormone-ablation therapy, the method comprising: 

(a) treating the prostate tissue with hormone ablation therapy; and 

(b) measuring PCGEM1 mRNA in the prostate tissue following 
hormone ablation therapy, 

wherein a decrease in PCGEM1 mRNA, as compared to an untreated cell line, 
correlates with the prostate tissue responding to hormone ablation therapy. 

[Figure: flowchart. Strategy for the identification of gene expression alterations in prostate cancer: OCT-embedded frozen prostate tumor/normal tissue; make 6 µm serial sections; histological examination of H&E slide; RNA preparation; RT-PCR amplification using arbitrary and anchored primers containing 5' M13 or T7 sequences.] 

cDNA sequence of PCGEM1 (SEQ ID NO: 1) 

AAGGCACTCT GGCACCCAGT TTTGGMCTG CAGTTTTAAA AGTCATAAAT TGAATGAAAA TGATAGCAAA 70 

GGTGGAGGTT TTTAAAGAGC TATTTATAGG TCCCTGGACA GCATCTTTTT TCAATTAGGC AGCAACCTTT 140 

TTGCCCTATG CCGTAACCTG TGTCTGCAAC TTCCTCTAAT TGGGAAATAG TTAAGCAGAT TCATAGAGCT 210 

GMTGATAAA ATTGTACTAC GAGATGCACT GGGACTCAAC GTGACCTTAT CAAGTGAGCA GGCTTGGTGC 280 

ATTTGACACT TCATGATATC ATCCAAAGTG GAACTAAAAA CAGCTCCTGG AAGAGGACTA TGACATCATC 350 

AGGTTGGGAG TCTCCAGGGA CAGCGGACCC TTTGGAAAAG GACTAGAAAG TGTGAAATCT ATTAGTCTTC 420 

GATATGAAAT TCTCTGTCTC TGTAAMGCA TTTCATATTT ACAAGACACA GGCCTACTCC TAGGGCAGCA 490 

AAAAGTGGCA ACAGGCAAGC AGAGGGAAAA GAGATCATGA GGCATTTCAG AGTGCACTGT CTTTTCATAT 560 

ATTTCTCAAT GCCGTATGTT TGGTTTTATT TTGGCCAAGC ATAACAATCT GCTCAAGAAA AAAAAATCTG 630 

GAGAAAACAA AGGTGCCTTT GCCAATGTTA TGTTTCTTTT TGACAAGCCC TGAGATTTCT GAGGGGAATT 700 

CACATAAATG GGATCAGGTC ATTCATTTAC GTTGTGTGCA AATATGATTT AAAGATACAA CCTTTGCAGA 770 

GAGCATGCTT TCCTAAGGGT AGGCACGTGG AGGACTAAGG GTAAAGCATT CTTCAAGATC AGTTAATCAA 840 

GAAAGGTGCT CTTTGCATTC TGAMTGCCC TTGTTGCAAA TATTGGTTAT ATTGATTAAA TTTACACTTA 910 
ATGGAAACAA CCTTTAACTT ACAGATGAAC AAACCCACAA AAGCAAAAAA TCAAAAGCCC TACCTATGAT 980 

TTCATATTTT CTGTGTAACT GGATTAAAGG ATTCCTGCTT GCTTTTGGGC ATAAATGATA ATGGAATATT 1050 

TCCAGGTATT GTTTAAAATG AGGGCCCATC TACAAATTCT TAGCAATACT TTGGATAATT CTAAAATTCA 1120 

GCTGGACATT GTCTAATTGT TTTTTATATA CATCTTTGCT AGAATTTCAA ATTTTAAGTA TGTGAATTTA 1190 

GTTAATTAGC TGTGCTGATC AATTCAAAAA CATTACTTTC CTAAATTTTA GACTATGAAG GTCATAAATT 1260 

CAACAAATAT ATCTACACAT ACAATTATAG ATTGTTTTTC ATTATAATGT CTTCATCTTA ACAGAATTGT 1330 

CTTTGTGATT GTTTTTAGAA AACTGAGAGT TTTAATTCAT AATTACTTGA TCAAAAAATT GTGGGAACAA 1400 

TCCAGCATTA ATTGTATGTG ATTGTTTTTA TGTACATAAG GAGTCTTAAG CTTGGTGCCT TGAAGTCTTT 1470 

TGTACTTAGT CCCATGTTTA AAATTACTAC TTTATATCTA AAGCATTTAT GTTTTTCAAT TCAATTTACA 1540 

TGATGCTAAT TATGGCAATT ATAACAAATA TTAAAGATTT CGAAATAGAA AAAAAAAAAA AAA 1603 

cDNA sequence of PCGEM1 (SEQ ID NO: 2) 

GCGGCCGCGT CGACGCAACT TCCTCTAATT GGGAAATAGT TMGCAGATT CATAGAGCTG AATGATAAAA 70 

T'TGTACTTCG AGATGCACTG GGACTCAACG TGACCTTATC AAGTGAGATG GAGTCTTGCC CTGTCTCCAA 140 

GGCTGGAGCC CAATGGTGTG ATCTTGGCTC ACTGCAACCT CCACCTCCCA GGTTCAAACG TTTCTCCTGC 210 

CTCAGCCTCC CAAGTAACTG GGATTACAGC AGGCTTGGTG CATTTGACAC TTCATGATAT CAGCCAAAGT 280 

GGMCTAAAA ACAGCTCCTG GAAGAGGACT ATGACATCAT CAGGTTGGGA GTCTCCAGGG ACAGCGGACC 350 

CTTTGGAAAA GGACTAGAAA GTGTGAAATC TATTAGTCTT CGATATGAAA TTCTCTGTCT CCGTAAAAGC 420 

ATTTCATATT TACAAGACAC AGGCCTACTC CTAGGGCAGC AAAAAGTGGC AACAGGCAAG CAGAGGGAAA 490 

AGAGATCATG AGGCATTTCA GAGTGCACTG TCTTTTCATA TATTTCTCAA TGCCGTATGT TTGGTTTTAT 560 

TTTGGCCAAG CATAACAATC TGCTCAAAAA AAAAAAATCT GGAGAAAACA AAGGTGCCTT TGCCAATGTT 630 

ATCTTTCTTT TTGACAAGCC CTGAGATTTC TGAGGGGAAT TCACATAAAT GGGATCAGGT CATTCATTTA 700 

CGTTGTGTGC AAATATGATT TAAAGATACA ACCTTTGCAG AGAGCATGCT TTCCTAAGGG TAGGCACGTG 770 

GAGGACTAAG GGTAAAGCAT TCTTCAAGAT CAGTTAATCA AGAAAGGTGC TCTTTGCATT CTGAAATGCC 840 

CTTGTTGCAA ATATTGGTTA TATTGATTAA ATTTACACTT AATGGAAACA ACCTTTAACT TACAGATGAA 910 

CAAACCCCAC AAAAGCAAAA AATCAAAAGC CCTACCTATG ATTTCATATT TTCTGTGTAA CTGGATTAAA 980 

GGATTCCTGC TTGCTTTTGG GCATAAATGA TAATGGAATA TTTCCAGGTA TTGTTTAAAA TGAGGGCCCA 1050 

TCTACAAATT CTTAGCAATA CTTTGGATAA TTCTAAAATT CAGCTGGACA TTGTCTAATT GTTTTTTATA 1120 

TACATCTTTG CTAGAATTTC AAATTTTAAG TATGTGAATT TAGTTAATTA GCTGTGCTGA TCAATTCAAA 1190 

AACATTACTT TCCTAAATTT TAGACTATGA AGGTCATAAA TTCAACAAAT ATATCTACAC ATACAATTAT 1260 

AGATTGTTTT TCATTATAAT GTCTTCATCT TAACAGAATT GTCTT'TGTGA TTGTTTTTAG AAAACTGAGA 1330 

GTTTTAATTC ATAATTACTT GATCAAAAAA TTGTGGGAAC AATCCAGCAT TAATTGTATG TGATTGTTTT 1400 

TATGTACATA AGGAGTCTTA AGCTTGGTGC CTTGAAGTCT TTTGTACTTA GTCCCATGTT TAAAATTACT 1470 

ACTTTATATC TAAAGCATTT ATGTTTTTCA ATTCAATTTA CATGATGCTA ATTATGGCAA TTATAACAAA 1540 

TATTAAAGAT TTCGAAATAG AAAAAAAAAA AAAAATCTA 1579 

FIG. 9 

cDNA sequence of PCGEM1 Promoter Region (SEQ ID NO: 3) 

TCCCTCTTGC GTTCTGCAAT TTCTGAAAAA AAGATGTTTA 
CTAATTCCAA TTTGATTCTA ATTGGATGAG TGACATGGGT 
GTAGTATGGA ATTTAATTAG TTCTCAGTAT GTTAGTGAAG 
ATTATAAATA TTTTAAAATG CAAAAAATTA TTCTAATGAA 
ACAATACCAC CCATAAAGTC ATCATCTAAT TTAAAAACTA 
AACATCTTTC CCGACTTGTG TGTTTTTTTC TTTTGCTTTT 
AGATTATATA GCTTTCCTTG TTTTAAGCTT TTTAAATAAT 
TTTTTTACTT AACATTATGG TTCTAAAATT CAGTAATGTG 
CTCTTTGACA TTCGACTATA TAAATTTCAG TTTGTTTATT 
TCTGTTTTTG CTGTTACAAA AATAATGCTG TTTTAAATTT 
TGAGTTATTC TAAGGTAAAA AAATAAGAAA AAATTGCTGG 
AAGATAATGC CAAATCATTT TTCAAAGTAA TTATACCTAT 
CACATAGTTG CTTGTTCTGC CAAAGTTTGG TATGATCGAA 
ATAAAATCTC AGTGTGCTTT TAATTTGCAT TTTCTATGTT 
ATTTACTTTT GCTGAAATGC TTGCTTATTA TTTTTGCTCC 
TTAATTTATA AGAATTTTAT ATGGTTTAGA TACTAATTAT 
TTGTGTACTT TCTACTTTAT GTCTTGTGAT GGATAAAAGT 
TTTAAATTTT ATAATCAGCA TCTTTAATAA TCTCTTTMTA 
TACATCTCTA TAATTTCTTA TTTTTTTGGC ATATGTTCAT 
TTGCAGTTAT TTATGAAACA AAT'AATTTTT AAAATTATAT 
CTTCACTATG AAGCTTGAGG CTTCACTGCA CGTTGTACTG 
TCTCTGAGTT CATGACACCT TTAGTGTCTC AGGTTTTTTT 
CCTAAGTTAA ATAAAAACAA AGCACAAAGC TATCAGCTTC 
AGTTGTAACT TGCCTGGTGC CCAATAGATG TCACTCTGTT 
CTCCTGTTAA TTCATGGTAG TGCCCCAAGG CACTCTGGCA 
ATAAATTGAA TGAAAATGAT AGCAAAGGTG GAGGTTTTTA 

FIG. 

cDNA sequence of PCGEM1 PROBE (SEQ ID NO: 4) 

TTTTTTCAAT TAGGCAGCAA CCTTTTTGCC CTATGCCGTA ACCTGTGTCT GCAACTTCCT CTAATTGGGA 70 

AATAGTTAAG CAGATTCATA GAGCTGAATG ATAAAATTGT ACTACGAGAT GCACTGGGAC TCAACGTGAC 140 

CTTATCAAGT GAGCAGGCTT GGTGCATTTG ACACTTCATG ATATCATCCA AAGTGGAACT AAAAACAGCT 210 

CCTGGAAGAG GACTATGACA TCATCAGGTT GGGAGTCTCC AGGGACAGCG GACCCTTTGG AAAAGGACTA 280 

GAAAGTGTGA AATCTATTAG TCTTCGATAT GAAATTCTCT GTCTCTGTAA AAGCATTTCA TATTTACAAG 350 

ACACAGGCCT ACTCCTAGGG CAGCAAAAAG TGGCAACAGG CAAGCAGAGG GAAAAGAGAT CATGAGGCAT 420 

TTCAGAGTGC ACTGTCTTTT CATATATTTC TCAATGCCGT ATGTTTGGTT TTATTTTGGC CAAGCATAAC 490 

AATCTGCTCA AGAAAAAAAA ATCTGGAGAA AACAAAGGTG CCTTTGCCAA TGTTATGTTT CTTTTTGACA 560 

AGCCCTGAGA TTTCTGAGGG GAATTCACAT AAATGGGATC AGGTCATTCA TTTACGTTGT GTGCAAATAT 630 

GATTTAAAGA TACAACCTTT GCAGAGAGCA TGCTTTCCTA AGGGTAGGCA CGTGGAGGAC TAAGGGTAAA 700 

GCATTCTTCA AGATCAGTTA ATCAAGAAAG GTGCTCTTTG CATTCTGAAA TGCCCTTGTT GCAAATATTG 770 

GTTATATTGA TTAAATTTAC ACTTAATGGA AACAACCTTT AACTTACAGA TGAACAAACC CACAAAAGCA 840 

AAAAATCAAA AGCCCTACCT ATGATTTCAT ATTTTCTGTG TAACTGGATT AAAGGATTCC TGCTTGCTTT 910 

TGGGCATAAA TGATAATGGA ATATTTCCAG GTATTGTTTA AAATGAGGGC CCATCTACAA ATTCTTAGCA 980 

ATACTTTGGA TAATTCTAAA ATTCAGCTGG ACATTGTCTA ATTGT 1025 

FIG. 12 

PCGEM1 Primers Used for PCR 

PCR PRIMER 1 (SEQ ID NO: 5) 

Sense Primer 5' TGCCTCAGCCTCCCAAGTAAC 3' 

PCR PRIMER 2 (SEQ ID NO: 6) 

Antisense Primer 5' GGCCAAAATAAAACCAAACAT 3' 

PCR PRIMER 3 (SEQ ID NO: 7) 

Sense Primer 5' TGGCAACAGGCAAGCAGAG 3' 

FIG. 13 
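
As a consistency check on the primer sequences listed above, the following sketch locates PCR primer 3 (sense, SEQ ID NO: 7) and the reverse complement of PCR primer 2 (antisense, SEQ ID NO: 6) in a 110-base excerpt of the PCGEM1 cDNA as printed in this document. Pairing these two primers is an assumption made for illustration only; the helper names are hypothetical, and the excerpt is transcribed from the cDNA figure, so it inherits any transcription errors in that figure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def amplicon(template, sense, antisense):
    """Return the template region bounded by a sense primer and the
    binding site of an antisense primer, or None if either is absent."""
    start = template.find(sense)
    if start == -1:
        return None
    site = reverse_complement(antisense)
    # Search for the antisense binding site downstream of the sense primer.
    stop = template.find(site, start + len(sense))
    if stop == -1:
        return None
    return template[start:stop + len(site)]

SENSE_PRIMER_3 = "TGGCAACAGGCAAGCAGAG"        # SEQ ID NO: 7
ANTISENSE_PRIMER_2 = "GGCCAAAATAAAACCAAACAT"  # SEQ ID NO: 6

# Excerpt of the PCGEM1 cDNA (SEQ ID NO: 1) in 10-base groups,
# copied from the rows of the cDNA figure that end at 560 and 630.
TEMPLATE = (
    "AAAAGTGGCA" "ACAGGCAAGC" "AGAGGGAAAA" "GAGATCATGA" "GGCATTTCAG"
    "AGTGCACTGT" "CTTTTCATAT" "ATTTCTCAAT" "GCCGTATGTT" "TGGTTTTATT"
    "TTGGCCAAGC"
)

product = amplicon(TEMPLATE, SENSE_PRIMER_3, ANTISENSE_PRIMER_2)
```

Both primers match the excerpt exactly, which suggests that the primer strings and this part of the cDNA figure were transcribed consistently.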

Complete Genomic DNA sequence of PCGEM1 gene. 

TCCCTCTTGCGTTCTGCAATTTCTGAAAAAAAGATGTTTATTGCAAAGTGATATGAGCACTGGAAAGGTACTAATTCCAA 

TTTGATTCTAATTGGATGAGTGACATGGGTAAGCGATTCTAAGCATTTGTGTTTTTTTTAGTAGTATGGAATTTAATTAG 

TTCTCAGTATGTTAGTGMGATGMTGAAJ^CATGCATATGTTTCCATGTATTATAMTATTTTAAAATGCAAAAAATTA 

TTCTAATGAATATATAMTATAMGCATMCMTMTMTACMTACCACCCATAMGTCATCATCTAATTTAAAAACTA 

AMCATTAACACTTGAATCTCCCCCATTGCAACATCTTTCCCGACTTGTGTGTTTTTTTCTTTTGCTTTTAAAATTTTTG 

TTTTATCATATGTCTGCATAAGATTATATAGCTTTCCTTGTTTTAAGCTTTTTAAATAATATATTGTAGTTATATTATTT 

GTGCTTTGCTTTTTTTACTTMCATTATGGTTCTAAMTTCAGTAATGTGTTGGGCATGTATAATTTGTTTATTTTTAAT 

CTCTTTGACATTCGACTATATAAATTTCAGTTTGTTTATTGACTCCTTTGTCTATACATACTCTGCTATTTCTGTTTTTG 

CTGTTACAAMTMTGCTGTTTTAMTTTCATTTTGTATACTTTTTTGAGGCATGTGTATGAGTTATTCTAAGGTAAAA 

AMTMGAAAAMTTGCTGGGTTATMGATTGTCACATGCTCGMTTTACAAGATMTGCCAMTCATTTTTCAAAGTAA 

TTATACCTATTTATACTACCGGTATGAGTATATTGGTGCCCACATAGTTGCTTGTTCTGCCAAAGTTTGGTATGATCGAA 

CAATAATTTTTGCCCATCAAATGGCATAAAATAAAATCTCAGTGTGCTTTTAATTTGCATTTTCTATGTTTAAGAATTGT 

TTCTTTTTTAACCATTTATAATTTACTTTTGCTGAAATGCTTGCTTATTATTTTTGCTCCCCATTTTTTCCTATTGGATT 

GCTTTTCTCATTAATTTATAAGAATTTTATATGGTTTAGATACTAATTATTATATTACTGAAAATACCTTTATCAGTTTG 

TTGTGTACTTTCTACTTTATGTCTTGTGAT(^ATAAMGTTTTAAATTGTATTGTCTTGAAGTTAACATTTTTAAATTTT 

ATAATCAGCATCTTTAATAATCTCTTTATAAAATTTTCCTTTACATAGATGTCATAAAGATACATCTCTATAATTTCTTA 

TTTTTTTGGCATATGTTCATTMGTCATTTTATCATTTTTOGTMTAMTOCAGTTATTTATGAMCAAATAATTTTT 

AAAATTATATATGCTTTCTTTAAAAATTGATCTTAGCATGCTTCACTATGAAGCTTGAGGCTTCACTGCACGTTGTACTG 

TTGTTTTTTGTCACAMGCACCTMGTTAiATAAAAACAAAGCACAAAGCTATCAGCTTCATGTATTAAGTAGTAAGCTC 

CCATGTTAACAGTTGTAACTTGCCTGGTGCCCAATAGATGTCACTCTGTTTTCCTAGAAACTTTAAAATATCCCTCAGTG 

CTCCTGTTAATTCATGGTAGTGCCCCAAGGCACTCTGGCACCCAGTTTTGGAACTGCAGTTTTAAAAGTCATAAATTGAA 

TGAAAATGATAGCAAAGGTGGAGGTTTTTAAAGAGCTATTTATACCTCCCTGGACAGCATCTTTTTTCAATTAGGCAGCA 

ACCTTTTTGCCTATGCCGTAACTGTGTCTGCACTTCCTCTAATTGGGGTGAGTAAGAGATTTTGTTATGTATATAATAGC 

TAAGAATATAGTAATAATCCCTTAAATCATGGTTATTTTTAAACTACTAACATTTAGAAGACAAAATAAAAATGCTTTGA 

AAAGTATAGAGGTTTTAGTGTAATTAGCAGGGAATAATGAAATGATTTGATAGGGCTACTCAGTTTTGTATAACTTTGGT 

GCTTTAAGTCTGAATGCAGAGCATGGATGTTGTGATCCAGCCTTTATATGTTTTCCCTGAAGAAGATTTAATTTATTTGG 

CCTTTTGAGAAACACATTTGGCATTGTAATATGTTTTGCTTCCAGGTTCTATCTCCAAGGATAATTTGACAAAATCACAC 

ATAAATTTATTTTCAGGGCACACAGTTTCCCTTTTAGGGAACTCACAGAGGTAGAGAGTAATACAATAATCACATTTGAA 

TATTCAGTAAGTGAGGTCCTCATAGATCTTATGTGTATGTCACCATGTATATAATTTTGTTAATCACTAGATGTATGAGA 

CMGAAATTTGAGGAATCTTAACTAGAGATTAAAATCAGGGATTTAAATCAAAGAAACATTTAAATGCCTCCTTTATTAT 

TTAMTACCTGCATGGGAGMTCATTGAAAAWlMTAmGCATACMCTTGGGMTATTATAMCCMGMGM 

GTTATTCTGGTTGATTTTTTTTTCAGGCTCCGCACAGGCAACTTACCTTTATCTCTTTGTGATTTTTATTTCTTGTTAAA 

ATATACAGAAATAGTTAAGCAGATTCATAGAGCTGAATATAAAATTTACTACGAGATGCACTGGGACTCAACGTGACCTT 

ATCAAGTGACTTATCAGTGAGGTGAGCATTCTTAATTCAGATAATGGAACTTATTATCATAATCTTTTGCTTATGCTATT 

GTTGAGCTTAACTACTTATTCATATTTGCATATGCATATTGAGATAATATCATTTCATTAATTTCAGTACTGAACACTAA 

TCTCCTAAGAGTAATTGTGAAAGTTTCAGATTGCACTATTTTTAACTATATATCTGTATGTTATCTTCATATATGCTTGA 

ATMCTTATMGCMTTGAMCTTTCMTTACAGTATACTATTGMGCAAATCAACAAATATATACACATATCCATTAGC 

AATAGTAGATAATTTTTGTAAATGTCCAGCACAGTTCTTCATATGTAGAGGATGTTCAAATTGGCTAAGTTCCTTTTCTC 

TCTTAATTATTAGTATTTTTCCTACTGCTCTTTGTATAATTATTCCTTCCTCTTTAGCTCCAATCCTTACAATCTATTCT 

FIG. 14 
TAACATAGCMCTGGGMGAAAGTTTTTA^ 
GTCACACATAGMGAAJ&AAAAAAMTATTC 
TTCATGTAATTTCCAGCCACTAGGCCTTi^TGGCTCTCCTTCMTCTCA^ 
GTTTTGGCCGAGGTATirCTTTTTW^ 

TtATTTCTATATACGTGCTAAAlGGTTTCCTTGTCCAMTAGCWAGTGACCACC 

MGTTTCTTTTCCTTTTCCTCACCACTOATATTTATATCAM 

GTACTAGCATTATGATGACCATACTATTTGATGCCCft^ 

TAMTT'ATATMTTTTGACATAGGCACTATTGACAAAMGCAATTGATCTTATC 

TATTTAAMGTMTTCTCTGAAATACMTTTOT 

ATATTTATCACT(XM3ATTTAMTAGTO 

TATCCAACTCTMTATAATGCCACT(£TAT^ 

TTATCTTAMTGAAMTTTTTGGTM^ 

TTTAGT1X3AGAAMTAATTTTTCTCTAGAGMTGMGTAGCTO 

ATCMCTCTTATTTTCTTCMTACGMTATATM 

MTCATMTTTCTGACAMTATTTTGGMGTCAAMCTTGTCTTCTATTTTGTTAT 

TAMCCTTTATACTATCAMTCATAGGCMTTTCAGTTTGATTTCATTCTGGTGCAGAATATMGTTTATC 

CAGGAGTGACTTCAAAJiGATTCCTCCCACTGACTGAGATATTCCAAAGCCMCTTTGCAAMTTO 

TACTTCTTTGTACCTTCATTTTATTTGTTCMTTTTTCTTTGTC 

TmTmMmACTACmATAATTTTTAMGGTAAG 

ATACMTTAATTTTGAGMCTGCAATAJ^J^ 

MTGCAGTACCAGTAGACTACATTTAGGCTGCTTAMGTTAGTTCTTCTMGTACCATATACmmTTTTAGCTAA 

TGATGGAGMCAMGACAGAMGACTGTGTTACCATATTCTAGTTGGCCATTTTGTTT 

GCCmTCATAmTTATTTGGTTTTACCATTmACTGTGAGCAAMTATACAGCATMTATACAAAATAAAATACAT 

GTACATCTTCACMCTTCTTGTTTAGGATGCMTTATATATATATATATATATATATTTA 

GGGTACATGGCACCACGTGCAGGTTGTT'ACATATGTATACATGTGCCATGTTGGTGTGCTGCACCCATTAACTCGTCATT 

TACATTAGGTGTATCTCCTAATGCTATCCCTCCCCTCTCTCCCCACCCCACAACAAGCCCCGGTGTGTGATGTTCCCCTT 

CCTGTGTCCATGTGTTCTCATCTTCMTTCCC 

GmCTGAGMTGATGGTTTCCAGCTTCATCCATGTCCCTACmGACATGAACTCATCATTTTTTATGGCTGCATAG 
TATTCCATGGTGTATATGTGCCACCATTTTCTTMTCCGAGTCTGTCCATTGTTGTTGGACATTTGG 
GTTTCATGTGTAGCATGTATAGCACMCCMTTMGATTTCTTTCTTTCTCTCTTTTOTTO 
GTCTTGCCTGTCTCCMGGCTGGAGCCCMTGGTGTGATCTTGGCTTACTO 

CTCCTGCCTCAGCCATCCGAGTAGCmACTATAGGCGTGCACCACCATGCCCAGCTMTTTOTATTTTTAGTACAG 

ACGGGGTTTCACCACGGTGGCCAGGATGGTCTCMTTTCTTGACCTCATGATTCACCCGCCTTGGCCTCCC 

GGATTACAG6TGTGAACCACCAAGCCCGGCCTGTCACAAGTTTTTAGTGTTCTATTTTMTACAG 

AAAGAGAMGACATTTCATATGTGCGTAGAGTTGTC^ 

MTAAJIGGCAAMTAGTCCTATGCAGTTOATTTMTATATTCTTMTMGAGCTACT^ 

AMCATCTAGATATGGATCTTCATTAGTGACTGACATMTATATTGTTATTGTTACTA 

TATTGAGTGCTTTGTGTATCCTMGCACTATGCTAJ^CAC 
ATTTCACTTTTCATATGAAAAAATTGAAGCACAGATTAAGACACTCCGAAATCATACCTCTATTGATTATCAGCACCAGG 

ATTTGAATTGAGGCACTCTGATCCAGAGAAGCTTTTGTTTCCATGAAGGCTTATGTTGGGGAAAAATAATCAAATTGCCT 

GTACCTCAGTTGTATAMTAAGAGGTTGGGTTGGTAGATGATTCTGGCTGATTCAGCAGAAAAGAAATTTATTCAAAGGA . 

TATCACACAGTTTTCATAACAGTTAAGAATACAGAGGAAACAGGGCACCAGGGCTAAGTACAGACCAAAGTCCAAAACCA 

CTGCCAAAGTTGCAGCAAGGAGAACAGCACAAATTTGCTTGCTGTCACCCGCCACTAGATGCTTTTGTTTGGAGCCTTGA 

ACTTGACTTACACTGCCACTGACATCAGCACCAGTGCTCTCTGTGTACTAGGAGGTGGAGTTGGTGACGTTGCTGAACTA 

AAAGCAGATGTTTCTGCTGTGAAATAGATACCTAATACAGAACCTGATTCCTCATTCATTCCCTCCCCAAATCATATGCT 

TGTAGTGTGGCTAGAGTTTCTGTTTCTCCTTGGTCCAGGCAGAATTTATGAAGCTTGCTATTTATCGCCTTAAAGATTAG 

AAGAATATTCATAAGGTATTAGATTGCCATAAGGTTGAACAAATCAACATTCAACTTCAAGGATTCAACATTGTTTTGTT 

TTCTTTTGGGATACCTCTGCAGCAGTTCAAATCTTATTTCTGCCCTTGGACAACCAGGTTTATAAATATTGCAGATTCTC 

CACTGACTGCTTTGATCCTATCTTCTATATTTATGTATACTAATTAGCATATAATAAAAGATTATGTTACAGAATCTCAA 

AATTAGTAATTATGAATTGAGATGGTGTTATACAGTACACTAACATCCAAGAGACTTGTTTATTCCAAGGAAAATATTTA 

GAGATATTAAATGATATTTCTCATCCTTTAGACATATACATTTTTTAGCTTACAGCCTGCTTTAGGCAAGCAACAGACTC 

TCAGGATCTGCTCCTACCAGGGTCTGAACATTTCCTCCCAGTTTTAAAGAAACAAATTCAAATAACATTGTAACCTCCAG 

AGGAAAGTTCAAGGTCTTTTATAGTATTGTTTAAACAGTACAGCTGAGGAAACTAAAGACAGAGAAGTTAAATGCCTTGG 

CACTTAGTCTAGATTTACAATAAACTCCTYTCTACTTAGGACCCACTAACAGGGGCTGCATTTACACCAAAACCATGAAG 

GTGGCCCAAGTCATCACTGAGAAGTAGTACAAGCACCGAGGGAATGACTTCAACAGGAACAAGAAAGCGTGGAAGGAGAT 

CCTAGCAGGAAGCTCCACAAGAAGATAGCATGTTACGTCTTGCATTGGATGAAGCAGGTTCAGAGAGACCTAGTGACAGC 

TATCTCCGTCAAGGTGCAGAAGGAGAGATCATTGAATGTAGCATTTTCATGCAAAAAAAAAAATGTTGAAGTCTTTGGAC 

TTCGGGAGTCTGTCCAAACTGCAGGTCACTCAGCCTACAGTTGGGATGAATTTCAAAACACCAGTTGGAGCCGGTTGAAT 

CTTTCTGCTATGCTGTAATATTTTCAGTAAACCCAGCGCAACAACAACAACAAAACACAAAAGGAGGAGAAGCAGCCAAG 

TCTCTTGGTTTACAGAGTAGCTCCTAATACCCCTTGCTGTCTGTCTCAAGTGCCCAATGGGAAGATAGTCAAAACAATAT 

TCACACCTGTGATTCATCTCTCTACATGCAGTGTGTGTGAATCTTTATATACTGCATATTAAGGATCTGTCTTTACAGAT 

AAAMCTAAAGCATTGAAGGAACTCCTTGTTTTGACTTATCAAAGTCCTTAAGAAAATACTAGAAAATTATAGCCATTGT 

TTCAAATTTTAGCTTTATATTATCACTTGAAATGTGATGAAATGTGGCTGATAGATMTAATTCACTGATAACCTACAGA 

CMTTCCCATCTTAAMTGGACCATTGGATTGAAGAATTAAATAAAATTGAGGGTTTTCCTTACATGTTTTGTCTAAAGA 

GCGMGTAGMCMCTGTTCATAGATCTTCATTGAGGATTCGCATGTGAAGTAAGTACTCCTAACATAAACAAGTGGAC 

TTATCMCCAAGTTCCATMTCATGMCAAAMTATTTGTCCCCAGAGAGACTATTTTTCCACCACATCTCTTGTAATA 

AACACAGAGCCCAGTTCAGTTAAAATACTTTAAGGGTGGACGGTTCAGGGCCTGCTGAGTGGCACTCAGTAAGAAAACCC 

AGCAGAACATTTACTTCTCTCTTTATTCCAGAGCATCAATGGCCAAGGCTGGAAGATCCCAGAACACTGAACAGACATTT 

GGTCTCTTATGGCCTGCCAATTTTCACAGTGGGTTCCAACGCTTTGGGTCAAACCAAAATAGACCTGTTAGAAAAATGTC 

GGTTGGAATACGCTAACAATAAGACAGAATAAATGTGATTATTTCACCTCATTTTTATAGGACTTGAGTAATTTTATTAT 

MCATTCTTGAGGGCTGGAAAATCTGAATGTTAGGACACCAAATATCTCCAGAAAACAAGTTTTATATTTCTAATCCTGC 

ATMTMCCTG(X3GCCACTGCAGGCCTCATTMTAAAMCCTAAT(^TATMCMTAATGAGGAGGAAATGCCAATGCC 

GCACMTCTGTTGAGACTAAMTATTTCTCACCCCAGCAGGCTTGGTGCATTTGACACTTCATGATATCAGCCAAAGTG 

GAACTAAAAACAGCTCCTGGAAGAGGACTATGACATCATCAGGTTGGGAGTCTCCAGGGACAGCGGACCCTTTGGAAAAG 

GACTAGAMGTGTGAMTCTATTAGTCTTCGATATGAMTTCTCTGTCTCTGTCAAAAGCATTTCATATTTACAAGACAC 

AGGCCTACTCCTAGGGCAGCAAAAAGTGGCAACAGGCAAGCAGAGGGAAAAGAGATCATGAGGCATTTCAGAGTGCACTG 
TCTTTTCATATATTTCTCAATGCCGTATGTTTGGTTT^ 

GGAGAAAACAAA(XHGCCTTTGCCAATGTTATGTTC 

GGGATCAGGTCATTCATTTACGTTGTGTKAMTATGATTTAM 

?A(£CACGTGGAGGACTMGGGTAMGCATTCTTCMGMTCAGTTMTCM 

CCCTTGTTKAAATATTKTTATATTGATTAAATTTACAOT 

CAAMGCAAAAJUiGCAAAAGCCCGACCTATGATTTCATATTTTCTG 

GGCATMTGATMTGGAATATTTCCAGGTATTGTTTAMTGAGGGCCCATCTACAMTTCTTAGCAATACTTTGGATA 
ATTCTAAMTTCAGCTGGACATTGTCTAATTGTTTTTTATATACATC 
TTAGTTAATTAGCTGTGCTGATCMTTCAAMCATTACTTTCCT'AJ^TTTTA 
TATATCTACACATACMTTATAGATTGTT^ 

GmCTGAGAGTTTTMTTCATMTTACGTTGATCAAmTTGTGGGAACMTCCAGCATTMTOTATGTGATTGTT 

TTTATGTACATMGGAGTCTTMGCTTGGTGCCTTC 

TCTAAAGttmATGTTTTTCAATTCAATO 

AGMTATGTGAATTGTTCACCATACATAGAMTGAAMGTTCATTTCGTAMGCMGATGCT^ 

GATTGAJ^GATtACTAGATTAGTAGAGGGCmCTmAGTCCCTMTCTACCCTTMTAGCCATGTGGTCACGTGTM 

GTCAGTGMCCCATCTCATTCTCCTCATACTTTTTTCATCTCTAAMTGAGGGTATMTTTMGCTCGTTCATTTTTTTT 

TTTTTTTGAGATAGAGTTTTGCTCTTGTCACCCAGGTTGGAGTGCAATGGCACGATCTCAGCTCACTGCAACCCTCTGCT 

TCCTCGGTTCAAGTGATTCTCCCTGCTTCAGCCTCCCAAGTGAGCCCGGGATTACAGGTGCCCGCCACCACATCTGGGCC 

TAGATTTTTTGTATTTTCACCATGTTGGCCAGGCTGGTCTCGMCCCCTACCTCAGGTGATCCCTCGCCTCGGCCTCTCA 

AAGTGCTGGGATTACAGGTGTGAGCCACCACGCCCAGCCCAATATCAGTTTTTCTTTTTTAACACAAGGCTAACACAATC 

AAMTACTAGCTAGGGGAG^A^AMTMGGCACTGTOTGTGTMCAGGCTCTTGTTGCAATCCACTGGGGCAGA 

CCAJlATMCAGTMGMTtAMTCCTTTTCATATMTCCTTTCTTTGCAGMTACATAAMTCCCCACAMTGGCTTAT 

CTTCCTTTTTATGATATGTTGGAGAATTGTAGCTAAGTGACAGATATTTTGCTTGGGTGTATAGACCACAAAGGACTGTG 

TCTTGATGATGGTTtGCATmTTATACCTTAGTTmCTTTGTATGTTACATGTTAGATTTAGAGTATGmTTAG 

TAGGGAGGATTATTMCAAAGMCAGGGCMGAGGAGTAGMTTAMCCTCTTCTMTACCTGTGCACMGTAGGCTTTT 

CAGMCTCTACMCCCCMCATAMCmTAGmGAAJ^GCACACTCCCMGGMGGCGGTTATGTTTTGCAGTTTG 

MTCAGMGMTAGAGCTATAGCMTCTTCATTCTATAGTMCATTAMGAGCCTGGTTTATATTATAGCAGTCATTM 

ATTTAAMTTTACATCTTGCCGTTCTTCTTACTCACAGATTTTCGAGAGGTAATGTAATGATCACACGAGGTGAGAATC 

ACTGCCTTTTATMTGCGATTAMTGCATGMCAMGTTTCCAACAAATAACAGTAATAAAAAGAAACATGTATTAGCAC 

TTMTMGCCAGGTGCTCTACGACGTGTGTTACATGCTTTCAATCCATGAACTGGTAMCTGGTACTAGTATCTCTATO 

GACATGTGAGGAAACCMTGGAGTTGATMCAGTAGAGTTAAAMTTACTCTTCATATATTATATO 

CAGACATCTCTGCTACCAAAAGCTATCATATCTAGACTCGA 

TESTCODE OF: vslnuc ck: 6724, 1 to: 1588 
WINDOW: 200 bp MARCH 14, 1999 20:25 

SEQUENCE LISTING 

<110> Srikantan, Vasantha 
Zou, Zhiqiang 
Moul, Judd W. 
Srivastava, Shiv 



<120> PROSTATE-SPECIFIC GENE, PCGEM1, AND METHODS OF USING 
PCGEM1 TO DETECT, TREAT, AND PREVENT PROSTATE CANCER 



<130> 4995.0053-003-04 



<140> 
<141> 



<150> 60/126,469 
<151> 1999-03-26 



<160> 22 



<170> PatentIn Ver. 2.1 



<210> 1 

<211> 1603 

<212> DNA 

<213> Homo sapiens 



<400> 1 

aaggcactct ggcacccagt tttggaactg cagttttaaa agtcataaat tgaatgaaaa 60 
cgatagcaaa ggtcgaggtt tttaaagagc tatttatagg tccctggaca gcaccttttt 120 
tcaactaggc agcaaccttt tcgccctatg ccgcaacctg tgcctgcaac ttcctctaat 180 
tgggaaatag ttaagcagat tcacagagct gaacgataaa attgtactac gagatgcact 240 
gggactcaac gcgaccttat caagtgagca ggcctggtgc atttgacact tcatgatatc 300 
atccaaagtg gaactaaaaa cagctcctgg aagaggacta tgacatcatc aggttgggag 360 
tccccaggga cagcggaccc tttggaaaag gactagaaag tgtgaaatct attagtcttc 420 
gatatgaaat tctccgtctc cgtaaaagca tttcatattt acaagacaca ggcctactcc 480 
tagggcagca aaaagtggca acaggcaagc agagggaaaa gagatcatga ggcatttcag 540 
agtgcactgt cttttcatat atttctcaat gccgtatgtt tggctttatt ttggccaagc 600 
ataacaatct gctcaagaaa aaaaaatctg gagaaaacaa aggtgccttt gccaatgtta 660 
tgtttctttt cgacaagccc cgagacttct gaggggaatt cacataaatg ggatcaggtc 720 
attcatttac gttgtgtgca aatatgattt aaagatacaa cctttgcaga gagcatgctt 780 
tcctaagggt aggcacgtgg aggactaagg gtaaagcatt cttcaagatc agttaatcaa 840 
gaaaggtgct ctctgcattc cgaaacgccc ctgttgcaaa cattggttat attgattaaa 900 
tctacactta atggaaacaa cctttaactt acagacgaac aaacccacaa aagcaaaaaa 960 
ccaaaagccc tacctatgac ttcatatttt ctgtgtaact ggattaaagg attcccgctt 1020 
gcttttgggc ataaatgaca atggaatatc tccaggtatt gtttaaaatg agggcccacc 1080 
tacaaattct tagcaatact ctggataatt ctaaaatcca gctggacatt gtctaattgt 1140 
tttttatata cacccctgcc agaatttcaa attttaagta cgtgaatcta gtcaactagc 1200 
tgtgctgatc aattcaaaaa cattaccttc ctaaatttta gactatgaag gtcataaatt 1260 
caacaaatat atctacacat acaattatag attgtttttc attacaatgt cttcatctta 1320 
acagaattgt ctttgtgatt gtttttagaa aactgagagt tttaattcat aattacttga 1380 
tcaaaaaatt gtgggaacaa tccagcatta atcgtatgtg attgttttta tgtacataag 1440 
gagtcttaag cttggtgcct tgaagccrtt cgtacctagt cccatgttta aaactactac 1500 
CCtatatcta aagcatttat gtttttcaat tceattcaca tgatgctaat tatggcaatt 1560 
ataacaaata ttaaagattt cgaaacagaa aaaaaaaaaa aaa 1603 



<210> 2 

<211> 1579 

<212> DNA 

<213> Homo sapiens 

<400> 2 

gcggccgcgt cgacgcaact ccctctaatt gggaaatagt taagcagatt catagagctg 60 
aatgataaaa ttgtacttcg agatgcactg ggactcaacg tgaccttatc aagtgagatg 120 
gagtcttgcc ctgcctccaa ggctggagcc caatggtgtg atcttggctc actgcaacct 180 
ccacctccca ggttcaaacg tttctcctgc ctcagcctcc caagtaactg ggattacagc 240 
aggcttggtg catttgacac ttcatgatat cagccaaagt ggaactaaaa acagctcctg 300 
gaagaggact atgacatcat caggttggga gtctccaggg acagcggacc ctttggaaaa 360 
ggactagaaa gtgtgaaatc tattagtctt cgatatgaaa ttctctgtct ccgtaaaagc 420 
atttcatatt tacaagacac aggcctactc ctagggcagc aaaaagtggc aacaggcaag 480 
cagagggaaa agagatcatg aggcatttca gagtgcactg tcttttcata tatttctcaa 540 
tgccgtatgc ttggttttat tttggccaag cataacaatc tgctcaaaaa aaaaaaatct 600 
ggagaaaaca aaggtgcctt tgccaatgtt atgrttcttt ttgacaagcc ctgagatttc 660 
tgaggggaat tcacataaat gggatcaggt cattcattta cgttgtgtgc aaatatgatt 720 
taaagataca acctttgcag agagcatgct ttcccaaggg taggcacgtg gaggactaag 780 
ggtaaagcat tcttcaagat cagttaatca agaaaggtgc tctttgcatt ctgaaatgcc 840 
cttgttgcaa atattggtta tattgattaa atttacactt aatggaaaca acctttaact 900 
tacagacgaa caaaccccac aaaagcaaaa aatcaaaagc cctacctatg atttcatatt 960 
ttctgtgtaa ctggattaaa ggattcctgc ttgcttttgg gcataaatga taatggaata 1020 
tttccaggta ttgtttaaaa tgagggccca tctacaaatt cttagcaata ctttggataa 1080 
ttctaaaatt cagctggaca ttgtctaatt gttttttata tacatctttg ctagaatttc 1140 
aaattttaag tatgtgaatt tagttaatta gctgtgctga tcaattcaaa aacattactt 1200 
tcctaaattt tagactatga aggtcataaa ttcaacaaat atatctacac atacaattat 1260 
agattgtttt tcattataat gtcttcatct taacagaatt gtctttgtga ttgtttttag 1320 
aaaactgaga gttttaattc ataattactt gatcaaaaaa ttgtgggaac aatccagcat 1380 
taattgtatg tgattgtttt tatgtacata aggagtctta agcttggtgc cttgaagtct 1440 
tttgtactta gtcccatgtt taaaattact actttatatc taaagcattt atgtttttca 1500 
attcaattta catgatgcta attatggcaa ttataacaaa tattaaagat ttcgaaatag 1560 
aaaaaaaaaa aaaaatcta 1579 



<210> 3 

<211> 1819 

<212> DNA 

<213> Homo sapiens 

<400> 3 

tccctcttgc gttctgcaat ttccgaaaaa aagatgtcta ttgcaaagtg atatgagcac 60 
cggaaaggta ccaattccaa tctgacccta attgcatgag cgacacgggt aagcgatccc 120 
aagcattcgt gtttttttta gcagcacgga atttaattag cccccagtat gtcagcgaag 180 
atgaatgaaa acatgcatat gtttccatgt attacaaata ctttaaaatg caaaaaatta 240 
ttctaatgaa tatataaata taaagcataa caacaacaac acaataccac ccataaagcc 300 
atcatctaat ttaaaaacta aaacaccaac actcgaatct cccccattgc aacatctttc 360 
ccgacttgtg tgtttttttc ttttgctttt aaaatttttg ttttatcata tgtctgcata 420 
agattatata gctttccttg ctttaagctt tttaaataat acatcgtagt tatattattt 480 
gtgctttgct ttctttactt aacattatgg ttctaaaatt cagtaatgtg ttgggcatgt 540 
ataatttgtt tatttttaat ctctttgaca ttcgactata taaattccag tttgtttatc 600 
gactcctttg tctatagata ctctgctatt tctgtttttg ctgtcacaaa aataatgccg 660 
tttcaaactt cattttgtat acttttttga ggcatgtgta tgagttattc taaggtaaaa 720 
aaataagaaa aaattgctgg gttataagat tgtcacatgc ccgaatctac aagataatgc 780 
caaatcattt ttcaaagcaa ttatacctat ttatactacc ggtatgagta tattggtgcc 840 
cacatagctg cccgttccgc caaagcttgg tatgatcgaa caataatttt tgcccatcaa 900 
atggcataaa ataaaatctc agtgtgcttt taatttgcat tttctatgtt taagaattgt 960 
ttctttttta accatttata atttactttt gctgaaacgc ttgcctatta tttttgcccc 1020 
ccattttttc ctattggatt gcttttctca ctaatttata agaattttat atggtttaga 1080 
tactaartat tatattactg aaaatacctt tatcagtttg ttgtgtactt tctactttat 1140 
gtcttgtgat ggataaaagt tttaaattgt attgtgttga agttaacatt tttaaatttt 1200 
ataatcagca tctttaataa tctctttmta aaattttcct ttacatagat gtcataaaga 1260 
tacatctcta caatttctta tttttttggc atatgttcac taagtcattt tatcattttt 1320 
tagtaataaa ttgcagttat ttatgaaaca aataattttt aaaattatat atgctttctt 1380 
taaaaattga tcttagcatg cttcactatg aagcttgagg cttcactgca cgttgtactg 1440 
aaatcacgta taaaacagtg gttctgaaaa tctctgagtt catgacacct ttagtgtctc 1500 
aggttttttt gcttttgttc ttgttttttc tcacaaagca cctaagttaa ataaaaacaa 1560 
agcacaaagc tatcagcttc atgtattaag tagtaagctc ccatgttaac agttgtaact 1620 
tgcctggtgc ccaatagatg tcactctgtt ttcctagaaa ctttaaaata tccctcagtg 1680 
ctcctgttaa ttcatggtag tgccccaagg cactctggca cccagttttg gaactgcagt 1740 
tttaaaagtc ataaattgaa cgaaaatgat agcaaaggtg gaggttttta aagagctatt 1800 
tataggtccc cggacagca 1819 



<210> 4 

<211> 1025 

<212> DNA 

<213> Homo sapiens 

<400> 4 

ttttttcaat caggcagcaa cctttttgcc ctatgccgta acctgtgtct gcaacttcct 60 

ctaattggga aacagtcaag cagattcata gagctgaatg ataaaatcgt actacgagat 120 

gcactgggac tcaacgtgac cttatcaagt gagcaggctt ggtgcatctg acacttcatg 180 

atatcatcca aagtggaact aaaaacagcc cctggaagag gactatgaca tcatcaggtt 240 

gggagtctcc agggacagcg gaccctttgg aaaaggacta gaaagtgtga aatctattag 300 

tcttcgatat gaaattctct gtctctgtaa aagcatttca tattcacaag acacaggcct 360 

actcctaggg cagcaaaaag tggcaacagg caagcagagg gaaaagagat catgaggcat 420 
ttcagagtgc actgtctttt catatatttc tcaatgccgt atgtttggtt ttattttggc 480 
caagcataac aatctgctca agaaaaaaaa atctggagaa aacaaaggtg cctttgccaa 540 
tgttatgttt ctttttgaca agccctgaga tttctgaggg gaattcacat aaatgggatc 600 
aggccattca tttacgttgt gtgcaaatac gatttaaaga tacaaccttt gcagagagca 660 
tgctttccta agggtaggca cgtggaggac taagggtaaa gcattcttca agatcagtca 720 
atcaagaaag gtgctctttg cattctgaaa tgcccttgtt gcaaatattg gttatattga 780 
ttaaatttac acttaatgga aacaaccttt aacttacaga tgaacaaacc cacaaaagca 840 
aaaaatcaaa agccctacct atgatttcat attttctgtg taactggatt aaaggattcc 900 
tgcttgcttt tgggcataaa tgataatgga atatttccag gtattgttta aaatgagggc 960 
ccatctacaa attcttagca atactttgga taattctaaa attcagctgg acattgtcta 1020 
attgt 1025 



<210> 5 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 5 

tgcctcagcc tcccaagtaa c 21 



<210> 6 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 6 

ggccaaaata aaaccaaaca t 21 



<210> 7 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 

<400> 7 

tggcaacagg caagcagag 19 
<210> 8 
<211> 11801 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> unsure 
<222> (7470) 

<223> Y may represent any of the four nucleotide bases 
<400> 8 

tccctcttgc gttctgcaat ttctgaaaaa aagatgttta ttgcaaagtg atatgagcac 60 
tggaaaggta ctaattccaa tttgattcta attggatgag tgacatgggt aagcgattct 120 
aagcatttgt gtttttttta gtagtatgga atttaattag ttctcagtat gttagtgaag 180 
atgaatgaaa acatgcatat gtttccatgt attataaata ttttaaaatg caaaaaatta 240 
ttctaatgaa tatataaata taaagcataa caataataat acaataccac ccataaagtc 300 
atcatctaat ttaaaaacta aaacattaac acttgaatct cccccattgc aacatctttc 360 
ccgacttgtg tgtttttttc ttttgctttt aaaatttttg ttttatcata tgtctgcata 420 
agattatata gctttccttg ttttaagctt tttaaataat atattgtagt tatattattt 480 
gtgctttgct ttttttactt aacattatgg ttctaaaatt cagtaatgtg ttgggcatgt 540 
ataatttgtt tatttttaat ctctttgaca ttcgactata taaatttcag tttgtttatt 600 
gactcctttg tctatagata ctctgctatt tctgtttttg ctgttacaaa aataatgctg 660 
ttttaaattt cattttgtat acttttttga cgcatgtgta tgagttattc taaggtaaaa 720 
aaataagaaa aaattgctgg gttataagat tgtcacatgc tcgaatttac aagataatgc 780 
caaatcattt ttcaaagtaa ttatacctat ttatactacc ggtatgagta tattggtgcc 840 
cacatagttg cttgttctgc caaagtttgg tatgatcgaa caataatttt tgcccatcaa 900 
atggcataaa ataaaatctc agtgtgcttt taatttgcat tttctatgtt taagaattgt 960 
ttctttttta accatttata atttactttt gctgaaatgc ttgcttatta tttttgctcc 1020 
ccattttttc ctattggatt gcttttctca ttaatttata agaattttat atggtttaga 1080 
tactaattat tatattactg aaaatacctt tatcagtttg ttgtgtactt tctactttat 1140 
gtcttgtgat ggataaaagt tttaaattgt attgtgttga agttaacatt tttaaatttt 1200 
ataatcagca tctttaataa tctctttata aaattttcct ttacatagat gtcataaaga 1260 
tacatctcta taatttctta tttttttggc atatgttcat taagtcattt tatcattttt 1320 
tagtaataaa ttgcagttat ttatgaaaca aataattttt aaaattatat atgctttctt 1380 
taaaaattga tcttagcatg cttcactatg aagcttgagg cttcactgca cgttgtactg 1440 
aaattatgta taaaacagtg gttctgaaaa tctctgagtt catgacacct ttagtgtctc 1500 
aggttttttt gcttttgttc ttgttttttc tcacaaagca cctaagttaa ataaaaacaa 1560 
agcacaaagc tatcagcttc atgtattaag tagtaagctc ccatgttaac agttgtaact 1620 
tgcctggtgc ccaatagatg tcactctgtt ttcctagaaa ctttaaaata tccctcagtg 1680 
ctcctgttaa ttcatggtag tgccccaagg cactctggca cccagttttg gaactgcagt 1740 
tttaaaagtc ataaattgaa tgaaaatgat agcaaaggtg gaggttttta aagagctatt 1800 
tataggtccc tggacagcat cttttttcaa ttaggcagca acctttttgc ctatgccgta 1860 
actgtgtctg cacttcctct aattggggtg agtaagagat tttgttatgt atataatagc 1920 
taagaatata gtaataatgg cttaaatcat ggttattttt aaactactaa catttagaag 1980 
acaaaataaa aatgctttga aaagtataga ggttttagtg taattagcag ggaataatga 2040 
aatgatttga tagggctact cagttttgta taactttggt gctttaagtc tgaatgcaga 2100 
gcatggatgt tgtgatccag cctttatatg ttttccctga agaagattta atttatttgg 2160 
ccttttgaga aacacatttg gcattgtaat atgttttgct tccaggttct atctccaagg 2220 
ataatttgac aaaatcacac ataaatttat tttcagggca cacagtttcc cttttaggga 2280 
actcacagag gtagagagta atacaataat cacatttgaa tattcagtaa gtgaggtcct 2340 
catagatctt atgtgtatgt caccatgtat acaatcttgc taatcactag atgtatgaga 2400 
caagaaattt gaggaatctt aactagagac taaaatcagg gatttaaatc aaagaaacat 2460 
ttaaatgcct cctttattat ttaaatacct gcatgggaga atcattgaaa aaaaaataaa 2520 
aagcatacaa cttgggaata ttataaacca agaagaattt gttattctgg ttgatttttt 2580 
tttcaggctc cgcacaggca acttaccttt atctctttgt gattttcatt tcttgttaaa 2640 
atatacagaa atagttaagc agattcatag agctgaatat aaaatttact acgagatgca 2700 
ctgggactca acgtgacctt atcaagtgac ttatcagtga ggtgagcatt cttaattcag 2760 
ataatggaac ttattatcat aatcttttgc ttatgctatt gttgagctta actacttatt 2820 
catatttgca tatgcatatt gagataatat catttcatta atttcagtac cgaacactaa 2880 
tctcctaaga gtaattgtga aagtttcaga ttgcactatt tttaactata tatctgtatg 2940 
ttatcttcat atatgcttga ataacttata agcaattgaa actttcaatt acagtatact 3000 
attgaagcaa atcaactaat atatacacat atccattagc aatagcagat aatttttgta 3060 
aatgtccagc acagttcttc atatgtagag gatgttcaaa ctggctaagt tccttttctc 3120 
tcttaattat tagtattttt cctactgctc tttgtataat tattccttcc tctttagctc 3180 
caatccttac aatctattct taacatagca actgggaaga aagtttttaa acataaacca 3240 
gatgatgtca ctccacccca caaaacttcc actattctct gtcacacata gaaagaaaga 3300 
aaaaaaatat tgaaaaccta caaagacttg ctatgatctg gtccaggctc tccctaaaat 3360 
ttcatgtaat ttccagccac taggcctttc tggccctcct tcaatctcat tagccttttc 3420 
actactacaa gttagactgg gttttggccg aggtatttct ttttttcata ttttgccttt 3480 
gcctagattg ctcttccaat agatattcac aattgcatca tcatttctat atacgtgcta 3540 
aaaggtttcc ctgtccaaaa tagcctcagt gaccacctga tctagaatag tctcgatcaa 3600 
aagtttcttt tccttttcct caccacttga tatttatatc aaacatttat ttgtgtaatt 3660 
tatgtgtttg tttgttttct gtactagcat tatgatgacc atactatttg atgcccccca 3720 
aaaaatactt tcgagaatga cagggcaaag ctaaaataat taaattatat aattttgaca 3780 
taggcactat tgacaaaaag caattgatgt tatgatagtg ttagatctat gaaatagtac 3840 
tatttaaaag taattctctg aaatacaatt ttccaaaact aaaagcagca tatgtacatg 3900 
aaacaccaaa aaacttcctt atatttatca ctggaagatt taaaatagta taagtagtaa 3960 
cttatttaat atatttttga ttatttaatt aattttatag tatccaactc taatataatg 4020 
ccagtggtat ttgttcaaaa tattttaatg ttgtctattt atttttaatt tgcctaaaaa 4080 
ttatcttaaa tgaaaatttt tggttaataa attcgaaaat actgaaaccc tcatctccag 4140 
tctctgtgga tcctaaagtt tttagttgag aaaataattt ttctctagag aatgaagtag 4200 
cttgtaagct tggagaaatt tctgctaaat aaatgatatt atcaactctt attttcttca 4260 
atacgaaata tataaatatt tcagctcata tatttttgca ggtgctatgc ttttgcttcc 4320 
aatcataatt cctgacaaat attttggaag tcaaaacttg tcttctattt tgttatttaa 4380 
aattatatag actacttttg caaaccttta tactatcaaa tcataggcaa tttcagtttg 4440 
atttcattct ggtgcagaat ataagtttat ccaagtaaaa caggagtcac ttcaaaagat 4500 
tcctcccact gactgagata ttccaaagcc aactttgcaa aatttcagaa ttaaatatta 4560 
tacttctttg taccttcatt ttatttgttc aatttttctt tgtgtttgta gaaaatttta 4620 
atatttttct gttttcaagt tttgatttta atttactact ttataatttt taaaggtaag 4680 
ttttgtgagg ctatattcat tatgtgtctt gaataaagac atacaattaa ttttgagaac 4740 
tgcaataaaa attataagac tattaaaaat gcagcaagtg tactacactt aggctgctaa 4800 
aaatgcagta ccagtagact acattcaggc tgcttaaagt tagttcctct aagtaccata 4860 
tactttaaaa ttttagctaa tgatggagaa caaagacaga aagactgtgt taccatattc 4920 
tagttggcca ttttgttttg ttttgagaga cgccacatca gccttatcat aaaaattatt 4980 
tggttttacc attttgactg tgagcaaaat atacagcata atatacaaaa taaaatatat 5040 



gtacaccttc acaacttctt gtttaggacg caattatata tatatatata tatatattta 5100 

ttattatact ttaagttcta gggcacacgg caccacgtgc aggctgttac atacgcatac 5160 

atgtgccacg ttggtgtgct gcacccatta actcgtcatt tacattaggt gtatctccta 5220 

acgctacccc tcccctctct ccccacccca caacaagccc cggtgtgtga tgttcccctt 5280 

cctgtgtcca tgtgttctca ttgttcaatt cccacctatg agtgagaaca cgcagtgttt 5340 

gcttttttgt ccttgcaata gcctgctgag aatgatggtt tccagcttca tccatgtccc 5400 

tacaaaggac atgaactcat cattttttat ggctgcatag tattccatgg tgtatatgtg 5460 

ccaccatttt cttaatccga gtctgcccat tgttgttgga catttgggtt gcaattttga 5520 

gtttcatgtg cagcatgtat agcacaacca accaagattt ctctctttct ctcttttttt 5580 

tttttttttg tcgaaatgga gtcttgcctg tctccaaggc tggagcccaa tggtgtgatc 5640 

ttggcttact gcaacctcca cctcccgggt tcaagcgatt ctcctgcctc agccatccga 5700 

gtagctggga ctataggcgt gcaccaccat gcccagctaa tttttgtatt tttagtacag 5760 

acggggtttc accacggtgg ccaggatggt ctcaatttct tgacctcatg attcacccgc 5820 

cttggcctcc caaagcgctg ggattacagg tgtgaaccac caagcccggc ctgtcacaag 5880 

tttttagtgt tctattttaa tacagaaatt agataaatcc aaagagaaag acattccata 5940 

tgtgcgtaga gttgtcggaa gaaatgagag tcttataaat aactttaaaa attgtgaaga 6000 

aataaaggca aaatagtcct atgcagtttg atttaaatat attcttaata agagctactt 6060 

ttgtgaaaac cagaatattg aaacatgtag atatggatct tcattagtga ctgacataat 6120 

atatcgtcac tgttactatt ttattgtatc agccaactaa tatcgagtgc tttgtgtatc 6180 

ctaagcacta tgctaaacac tgtaccagta ttacctgata taatcatatt aatatttatt 6240 

atttcacttt tcacatgaaa aaattgaagc acagattaag acactccgaa atcatacctc 6300 

cattgattat cagcaccagg atttgaattg aggcactctg atccagagaa gcttttgttt 6360 

ccatgaaggc ttatgttggg gaaaaacaat caaattgcct gtacctcagt tgtataaata 6420 

agaggttggg ttggtagatg attctggctg attcagcaga aaagaaattt attcaaagga 6480 

tatcacacag ttttcataac agttaagaat acagaggaaa cagggcacca gggctaagta 6540 

cagaccaaag tccaaaacca ctgccaaagt tgcagcaagg agaacagcac aaatttgctt 6600 

gctgtcaccc gccactagat gcttttgttt ggagccttga acttgactta cactgccact 6660 

gacatcagca ccagtgctct ctgtgtacta ggaggtggag ttggcgacgt tgctgaactc 6720 

aaagcagatg tttctgctgt gaaatagata cctaatacag aacctgcttc ctcattcatt 6780 

ccctccccaa atcatatgct tgtagtgtgg ctagagtttc tgtttctcct tggtccaggc 6840 

agaatttatg aagcttgcta tttatcgcct taaagattag aagaatattc ataaggtatt 6900 

agattgccat aaggttgaac aaatcaacat tcaacttcaa ggattcaaca ttgttttgtt 6960 

ttcttttggg atacctctgc agcagttcaa atcttatttc tgcccttgga caaccaggtt 7020 

tataaatatt gcagattctc cactgactgc tttgatccta tcttctatat ttatgtatac 7080 

taattagcat ataataaaag attatgttac agaatctcaa aattagtaat tatgaattga 7140 

gatggtgtta tacagtacac taacatccaa gagacttgtt tattccaagg aaaatattta 7200 

gagatattaa atgatatttc tcatccttta gacatataca ttttttagct tacagcctgc 7260 

tttaggcaag caacagactc tcaggatctg ctcctaccag ggtctgaaca tttcctccca 7320 

gttttaaaga aacaaattca aataacattg taacctccag aggaaagttc aagctctttt 7380 

atagtattgt ttaaacagta cagctgagga aactaaagac agagaagtta aatgccttgg 7440 

cacttagtct agatttacaa taaactccty tctacttagg acccactaac aggggctgca 7500 

tttacaccaa aaccatgaag gtggcccaag tcatcactga gaagtagtac aagcaccgag 7560 

ggaatgactt caacaggaac aagaaagcgt ggaaggagat cctagcagga agctccacaa 7620 

gaagatagca tgttacgtct tgcattggat gaagcaggtt cagagagacc tagcgacagc 7680 

tatctccgtc aaggtgcaga aggagagatc attgaatgta gcattttcat gcaaaaaaaa 7740 

aaatgttgaa gtctttggac ttcgggagtc tgtccaaact gcaggtcact cagcctacag 7800 

ttgggatgaa tttcaaaaca ccagttggag ccggttgaat ctttctgcta tgctgtaata 7860 

ttttcagtaa acccagcgca acaacaacaa caaaacacaa aaggaggaga agcagccaag 7920 
tctcctggtt tacagagtag ctcctaatac cccttgctgt ccgtctcaag tgcccaatgg 7980 
gaagatagtc aaaacaatat tcacacctgt gattcatctc tctacacgca gtgtgtgtga 8040 
atctttatat actgcatatt aaggatctgt ctttacagat aaaaactaaa gcattgaagg 8100 
aactccttgt tttgacttat caaagtcctt aagaaaatac tagaaaacta tagccattgt 8160 
ttcaaatttt agctttatat tatcacttga aatgtgatga aacgcggccg atagataata 8220 
attcactgac aacctacaga caattcccat ctcaaaatgg accattggat tgaagaacca 8280 
aacaaaattg agggttttcc ttacatgttt tgtctaaaga gcgaagtaga aacaactgcc 8340 
catagatctt cattgaggat tcgcatgtga agtaagtact cctaacacaa acaagtggac 8400 
ttatcaacca agttccataa atcatgaaca aaaatatttg tccccagaga gactattttt 8460 
ccaccacatc ccttgtaata aacacagagc ccagttcagt taaaatagtt taagggtgga 8520 
cggttcaggg cctgctgagt ggcactcagt aagaaaaccc agcagaacat ttacttctct 8580 
ctttattcca gagcatcaat ggccaaggct ggaagatccc agaacaccga acagacattt 8640 
ggtctcttat ggcctgccaa ttttcacagt gggttccaac gctttgggtc aaaccaaaat 8700 
agacctgtta gaaaaatgcc ggttggaata cgctaacaat aagacagaat aaatgtgatt 8760 
atttcacctc atttttatag gacttgagta attttattat aacattcttg agggctggaa 8820 
aatctgaatg ttaggacacc aaatatctcc agaaaacaag ttttatattt ctaatcctgc 8880 
ataataaacc tggggccact gcaggcctca ttaataaaaa cctaatggta taacaataat 8940 
gaggaggaaa tgccaatgcc gcacaaatct gttgagacta aaatatttct caccccagca 9000 
ggcttggtgc atttgacact tcatgatatc agccaaagtg gaactaaaaa cagctcctgg 9060 
aagaggacta tgacatcatc aggttgggag tctccaggga cagcggaccc tttggaaaag 9120 
gactagaaag tgtgaaatct attagtcttc gatatgaaat tctctgtctc tgtcaaaagc 9180 
atttcatatt tacaagacac aggcctactc ctagggcagc aaaaagcggc aacaggcaag 9240 
cagagggaaa agagatcatg aggcatttca gagtgcactg tcttttcata tatttctcaa 9300 
tgccgtatgt ttggttttat tttggccaag cataacaatc tgctcaagaa aaaaaaatct 9360 
ggagaaaaca aaggtgcctt tgccaatgtt atgtttcttt ttgacaagcc ctgagatttc 9420 
tgaggggaat tcacataaat gggatcaggt cattcattta cgttgtgtgc aaatatgatt 9480 
taaagataca acctttgcag agagcatgct ttcctaaggg taggcacgtg gaggactaag 9540 
ggtaaagcat tcttcaagaa tcagttaatc aaagaaaggt gctctttgca ttctgaaatg 9600 
cccttgttgc aaatattggt tatattgatt aaatttacac ttaatggaaa caacctttaa 9660 
cttacagatg aacaaaccca caaaagcaaa aaatcaaaag ccctacctat gatttcatat 9720 
tttctgtgta actggattaa aggattcctg cttgcttttg ggcataaatg ataatggaat 9780 
atttccaggt attgtttaaa atgagggccc atctacaaat ccttagcaat actttggata 9840 
attctaaaat tcagctggac attgtctaat tgttttttat atacatcttt gctagaattt 9900 
caaattttaa gtatgtgaat ttagttaatt agctgtgctg atcaattcaa aaacattact 9960 
ttcctaaatt ttagactatg aaggtcataa attcaacaaa tatatctaca catacaatta 10020 
tagattgttt ttcattataa tgtcttcatc ttaacagaat tgtctttgtg attgttttta 10080 
gaaaactgag agttttaatt cacaattacg ttgatcaaaa aattgtggga acaatccagc 10140 
attaattgta tgtgattgtt tttatgtaca taaggagtct taagcttggt gccttgaagt 10200 
cttttgtact tagtcccatg tttaaaatta ctactttata tctaaagcat ttatgttttt 10260 
caattcaatt tacatgatgc taattatggc aattataaca aatattaaag atttcgaaat 10320 
agaatatgtg aattgttcac catacataga aatgaaaagt tcatttcgta aagcaagatg 10380 
ctgggtgaaa gagtgctttt gattgaaaga tcactagatt agtagagggc aagactttta 10440 
gtccctaatc tacccttaat agccatgtgg tcacgtgtaa gtcagtgaac ccatctcatt 10500 
ctcctcatac ttttttcatc tctaaaatga gggcataatt taagctcgtt catttttttt 10560 
tttttttgag atagagtttt gcccttgtca cccaggttgg agcgcaacgg cacgatctca 10620 
gctcactgca accctctgct tccccggctc aagtgattct ccctgctcca gcctcccaag 10680 
tgagcccggg attacaggtg cccgccacca catctgggcc tagatttttt gtattttcac 10740 
catgttggcc aggctggtct cgaaccccta cctcaggtga tccctcgcct cggcctctca 10800 
aagtgctggg attacaggtg cgagccacca cgcccagccc aatatcagtt tttctttttt 10860 
aacacaaggc caacacaatc aaaacactag ccaggggaga aaaaaaaaat aaggcactgc 10920 
ttatgtgtaa caggctctcg ttgcaatcca ctggggcaga ccaaacaaac agcaagaacc 10980 
aaatcctttt cacacaaccc tttccttcca gaacacacaa aacccccaca aatggcctat 11040 
cttccctttt acgatatgtt ggagaattgt agccaagtga cagatatttt gcctgggcgt 11100 
acagaccaca aaggaccgtg tcttgatgat ggcttgcara aaattatacc ttagttttta 11160 
ctttgtatgt tacatgttag atttagagta cgaaaattag tagggaggat taccaacaaa 11220 
gaacagggca agaggagtag aattaaaccn ctcctaatac ctgtgcacaa gtaggctttt 11280 
cagaaaccct acaaccccaa cataaactgg atagttagaa aagcacactc ccaaggaagg 11340 
cggctatgtt ttgcagtttg aatcagaaga acagagctat agcaatcttc attctatagt 11400 
aacatcaaag agcctggttt acattacagc agtcattaag atttaaaaat ttacatctcg 11460 
ccgttcttct cactcacaga ttttcgagag gtaatgtaat gaccacacga ggcgagaatc 11520 
actgcctttt ataatgcgat taaatgcatg aacaaagttt ccaacaaata acagtaataa 11580 
aaagaaacac gtattagcac ttaataagcc aggtgctgta cgacgcgtgt tacatgcttt 11640 
caatccatga actggtaaac tggtactagt atctcnattg gacacgtgag gaaaccaaat 11700 
ggagttgata aacagcagag ttaaaaatta ctcttcatat attatattgc ctcaatctca 11760 
cagacatctc tgccaccaaa agctaccata tctagactcg a 11801 



<210> 9 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 9 

tggcaacagg caagcagag 19 



<210> 10 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 10 

ggccaaaata aaaccaaaca t 21 



<210> 11 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 11 

gcaaatatga tttaaagata caac 24 



<210> 12 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 12 

ggttgtatct ttaaatcata tttgc 25 



<210> 13 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 13 

actgtctttt catatatttc tcaatgc 27 



<210> 14 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 14 

aagtagtaat tttaaacatg ggac 24 



<210> 15 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 15 

tctttcaatt acgcagcaac c 21 



<210> 16 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 16 

gaattgtctt tgtgattgtt tttag 25 



<210> 17 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 17 

caattcacaa agacaattca gttaag 26 



<210> 18 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 18 

acaattagac aatgtccagc tga 23 



<210> 19 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 19 

ctttggctga tatcatgaag tgtc 24 



<210> 20 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 20 

aaccttttgc cctatgccgt aac 23 



<210> 21 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 21 

gagactccca acctgatgat gt 22 



<210> 22 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Probe/Primer 
<400> 22 

ggtcacgttg agtcccagtg 20 